PEPTIDE-BASED METHOD FOR MONITORING GENE EXPRESSION IN A HOST CELL

The present invention relates to a method for monitoring the expression level of a gene
in a host cell by modulating the activity of a regulatory biomolecule, comprising the
steps of: (a) transforming a cell expressing a regulatory biomolecule with a nucleic
acid molecule comprising an open reading frame encoding an interaction partner of
said biomolecule in expressible form, wherein (i) said regulatory biomolecule is
either a nucleic acid binding molecule that effects its regulatory activity upon binding
or an allosterically controlled ribonucleic acid molecule; and (ii) the interaction
partner of the biomolecule is encoded by a nucleic acid molecule comprising: (1) a
nucleic acid sequence encoding a tagged (poly)peptide, or (2) a nucleic acid
sequence encoding a tagged (poly)peptide or a peptide tag, a selectable marker
gene and additional nucleotide sequences for site-specific, in-frame integration of
said nucleic acid molecule into the coding sequence of at least one host
(poly)peptide of interest, wherein said tag comprises the interacting residues of the
interaction partner; and (b) assessing the expression level of the gene. Furthermore,
the present invention relates to a method of producing and/or selecting a compound
capable of modulating the activity of a nucleic acid binding protein comprising the
steps of: (a) conducting a selection of compounds with the nucleic acid binding
target protein under conditions allowing an interaction of the compound and the
nucleic acid binding protein; (b) removing unspecifically bound compounds; (c)
detecting specific binding of compounds to the nucleic acid binding target protein;
(d) expressing in a cell the nucleic acid binding protein and providing in trans the
coding sequence of at least one indicator gene, wherein said coding sequence is
under control of the target sequence of the nucleic acid binding protein; (e) adding a
candidate compound to the cell of step (d); (f) determining the amount or activity of
the indicator protein, wherein a reduced or increased amount of indicator protein is
indicative of compounds capable of modulating the activity of the nucleic acid
binding protein; (g) selecting compounds capable of modulating the activity of the
nucleic acid binding protein. Moreover, the present invention relates to nucleic acid
molecules, polypeptides, expression vectors, host cells, ensembles of host cells and
a non-human animal comprising said host cells. Finally, the present invention
relates to a kit comprising the nucleic acid molecule, the (poly)peptide, the vector,
the host cell or the ensemble of host cells of the present invention in one or more
containers.



Several documents are cited throughout the text of this specification. The disclosure
content of the documents cited herein (including any manufacturer's specifications,
instructions, etc.) is herewith incorporated by reference.

Despite tremendous advances in our comprehension of the molecular basis of
infectious diseases or diseases such as cancer, substantial gaps remain both in our
understanding of disease pathogenesis and in the development of effective
strategies for early diagnosis and for treatment. Proteomic approaches to
investigate disease will overcome some of the limitations of other approaches. The
opportunities as well as the challenges facing disease proteomics are formidable.
Particularly promising areas of research include: delineation of altered protein
expression, not only at the whole-cell or tissue levels, but also in subcellular
structures, in protein complexes and in biological fluids; the development of novel
biomarkers for diagnosis and early detection of disease; and the identification of
new targets for therapeutics and the potential for accelerating drug development
through more effective strategies to evaluate therapeutic effect and toxicity.

Despite earlier predictions to the contrary, infectious diseases remain a leading
cause of death worldwide. A complicating factor in therapy for infectious diseases is
the development of resistance to commonly used drugs (for example, as has
occurred in tuberculosis), which heightens the need for developing effective new
therapies. Interest in the application of proteomics to microbiology goes back at
least two decades, with the pioneering work of Fred Neidhardt to characterize
protein expression patterns in Escherichia coli under different growth conditions
(VanBogelen, R. A., Schiller, E. E., Thomas, R. D. & Neidhardt, F. C. Diagnosis of
cellular states of microbial organisms using proteomics. Electrophoresis 20,
2149-2159 (1999)).

The complete sequencing of a number of microbial genomes has provided a
framework for characterizing the role of various proteins and their relevance for
pathogenic persistence or for disease progression. A crucial aspect of the
continuing fight against pathogenic organisms is the question of whether or not it will
be possible to identify those proteins within the pathogenic proteome, the
expression of which is essential for the survival of the pathogen. Many of these
proteins are likely to be tissue-specific proteins because an optimal adaptation to the
local conditions within the infected host is probably the most important prerequisite
for survival of the pathogen. Due to their outstanding relevance for survival, these
tissue-specific proteins are likely to be prime targets for drug development.

At present, the characterization of tissue- or environment-specific gene expression is
primarily based on biochemical analysis of individual proteins or protein sets [see
(Hanash, 2003) and (Jain, 2000) for reviews and references cited therein]. For
proteome analysis, pathogenic microorganisms are often grown in vitro (in a
chemostat, for example) under conditions known or assumed to mimic conditions in
a host organism [Streptococcus mutans - (Len et al., 2003); Salmonella typhimurium
- (Deiwick et al., 2002); Pseudomonas aeruginosa - (Guina et al., 2003); Leishmania
infantum - (El Fakhry et al., 2002); Mycobacterium tuberculosis - (Rosenkrands et
al., 2002)]. The quest for identifying new tissue- or disease-specific proteins is
hampered by the fact that methods like tissue micro-dissection, Protein Chip Array
technologies, MALDI-TOF mass spectrometry or mass spectrometric analysis of in
situ tissue sections are time-consuming, expensive and technically demanding
(Hanash, 2003; von Eggeling et al., 2000); they are also often limited by the
presence of buffer components in biological samples, the heterogeneity of the
protein sample or the lack of adequate amounts of sample for analysis. Tagging of
all identified ORFs allows either, with a GFP tag, separation by FACS of cells
expressing a tagged protein from those not expressing it, together with
quantification of highly expressed or strongly localized proteins (Huh et al.,
2003), or, with a TAP tag, precise quantification of a tagged protein's expression
level (Ghaemmaghami et al., 2003). Thus, quantification and visual detection
still require the introduction of two independent tags per protein. While the in vitro
manipulation of growth conditions allows the isolation of large amounts of protein
samples, this need not faithfully represent the situation in a living host organism.

Thus, the technical problem underlying the present invention was to provide means
and methods for identifying gene expression under specific growth conditions,
including tissue-specific gene expression, in a host cell or in an in vivo animal model
for a disease. The solution to this technical problem is achieved by providing the
embodiments characterized in the claims.

The present invention relates to a method for monitoring the expression
level of a gene in a host cell by modulating the activity of a regulatory biomolecule
comprising the steps of: (a) transforming a cell expressing a regulatory biomolecule
with a nucleic acid molecule comprising an open reading frame encoding an
interaction partner of said biomolecule in expressible form, wherein (i) said
regulatory biomolecule is either a nucleic acid binding molecule that effects its
regulatory activity upon binding or an allosterically controlled ribonucleic acid
molecule; and (ii) the interaction partner of the biomolecule is encoded by a nucleic
acid molecule comprising: (1) a nucleic acid sequence encoding a tagged
(poly)peptide, or (2) a nucleic acid sequence encoding a tagged (poly)peptide or a
peptide tag, a selectable marker gene and additional nucleotide sequences for
site-specific, in-frame integration of said nucleic acid molecule into the coding sequence
of at least one host (poly)peptide of interest, wherein said tag comprises the
interacting residues of the interaction partner; and (b) assessing the expression level
of the thus tagged gene.

The term "(poly)peptide" refers alternatively to peptide or to polypeptide. Peptides
conventionally are covalently linked amino acids of up to 30 residues, whereas
polypeptides (also referred to as "proteins") comprise 31 and more amino acid
residues.
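The length convention above can be written down as a minimal sketch. The function name `classify_chain` and the one-letter string representation of the chain are illustrative assumptions and are not part of the specification; only the cut-off (up to 30 residues versus 31 and more) is taken from the definition.

```python
def classify_chain(sequence: str) -> str:
    """Classify an amino acid chain by length.

    Assumption for illustration: the chain is given as a string with one
    character per residue. Up to 30 residues is a peptide; 31 or more is a
    polypeptide, following the definition in the text.
    """
    residues = sequence.strip()
    if not residues:
        raise ValueError("empty sequence")
    return "peptide" if len(residues) <= 30 else "polypeptide"


if __name__ == "__main__":
    print(classify_chain("A" * 30))  # chain at the upper peptide limit
    print(classify_chain("A" * 31))  # shortest chain counted as polypeptide
```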

The term "biomolecule" relates to a (poly)peptide or an RNA molecule or a
ribonucleoprotein which is capable of modulating the amount of gene expression,
preferably the amount of protein expression. "Modulating gene expression" means
inhibiting or enhancing gene expression. An example of a biomolecule which is
capable of inhibiting gene expression is the Lac repressor. By binding
to its recognition site, the Lac repressor is capable of inhibiting the expression of
open reading frames which are located downstream of this recognition site.
Alternatively, the biomolecule can be an enhancer or activator of gene expression.
Examples of enhancers capable of increasing gene expression are the AraC
protein and the cAMP responsive protein (CRP). Preferably, the biomolecule is a
nucleic acid binding biomolecule. More preferably, the biomolecule is a nucleic acid
binding (poly)peptide selected from the group consisting of regulator of transcription,
regulator of translation, recombinase, (poly)peptide involved in RNA transport or a
ribozyme capable of regulating transcription or translation. Even more preferred are
(poly)peptides comprising the (poly)peptide sequence of tetracycline repressor; Lac
repressor; Xylose repressor; AraC protein; TetR-based T-Rex system (Yao et al.,
1998, distributed by Invitrogen); erythromycin-specific repressor MphR(A) (Weber
2002); Pip (pristinamycin interacting protein, Fussenegger et al., 2000); ScbR
(quorum sensing regulatory protein from Streptomyces coelicolor, Weber et al.,
2003); TraR (quorum sensing regulatory protein of Agrobacterium tumefaciens)
fused to the eukaryotic activation domain p65 of NFκB (Neddermann et al., 2003);
chimeric proteins consisting of the: (i) Gal4 DNA-binding domain and either a
full-length PhyA protein (PhyA-GBD) or the N-terminal photosensory domain of PhyB
[PhyB(NT)-GBD] of Arabidopsis thaliana (Shimizu-Sato et al., 2002); (ii) steroid
hormone regulated systems like the GeneSwitch regulatory system from Valentis
(www.geneswitch.com), sold by Invitrogen (catalog number K1060-02) and
consisting of (1) a Gal4 DNA-binding domain fused to a human progesterone
receptor ligand binding domain and an NFκB-derived p65 eukaryotic transcription
activation domain and (2) the inducer mifepristone; (iii) dimerizer systems from
ARIAD consisting of (1) a ZFHD1 DNA-binding domain fused to FKBP12, (2) a
modified form of FRAP, termed FRB, fused to the NFκB-derived p65 activation
domain and (3) rapamycin, AP22565 or AP12967 as heterodimer-forming agents,
as they bind to both (1) and (2); (iv) Ecdysone-Inducible Expression System (with
the Inducing Agents Ponasterone A and Muristerone A) from Invitrogen (catalog
numbers K1001-01, K1003-01, K1004-01) containing (1) a modified form of the
Drosophila ecdysone receptor (VgEcR) fused to a VP16 activation domain, (2) the
mammalian homologue RXR of ultraspiracle, the natural binding partner of the
ecdysone receptor in Drosophila and (3) the inducers ponasterone A and
muristerone A.

The term "fragment" relates to the functional domains of a protein. These may be,
for example, a nucleic acid binding domain, a multimerization domain or an effector
domain. The person skilled in the art knows that such domains may be linked by
appropriate linkers. Although not every combination of domains may result in a
functional chimeric protein, the skilled person can easily determine whether a
specific chimeric protein has biological activity. In many cases no precise domain
boundaries may exist so that the functional domain may have additional N- and
C-terminal amino acid residues.

Nucleic acid binding (poly)peptides are known to have at least two states: in their
"off-state", nucleic acid binding (poly)peptides have a low affinity for their binding
sequence on the nucleic acid molecule, whereas in their "on-state", nucleic acid
binding (poly)peptides have a high affinity for their binding sequence. Accordingly,
when the biomolecule is a nucleic acid binding (poly)peptide, the term "modulating
the activity of a biomolecule" refers to the regulation of the nucleic acid binding
activity of the biomolecule. "Modulating the activity" can in some cases include, in
addition to "on-states" and "off-states", a number of intermediate states, which
become particularly evident when the biomolecule is an allosterically regulated
protein; such proteins are also comprised by the term "biomolecule" (see, for example,
Genes, Lewin, J. Wiley & Sons, 1983, 1st edition, page 222, 1st col., last paragraph
to 2nd col., last paragraph). The term "monitoring" means observing the activity or
expression level of a protein. "Monitoring" comprises direct or indirect measures.
Preferred are indirect measures which rely on an effect of the biomolecule on, for
example, transcription or translation of RNA. Particularly preferred are indirect
measures based on the expression of a reporter protein. In these cases, monitoring
means measuring the amount or activity of the reporter protein.

The term "host cell" means a eukaryotic or prokaryotic cell. The cell can be the cell of
a uni-cellular or multi-cellular organism, which may be pathogenic or
non-pathogenic. Preferably, the host cell is selected from mammalian, insect, nematode,
plant, yeast, protist cell, gram-positive or gram-negative bacteria, archaebacteria or
protozoa. Preferably the cell is a cell expressing an allosterically regulated
biomolecule. More preferably, the cell is a cell expressing a biomolecule which is a
nucleic acid binding (poly)peptide. Said biomolecule can be expressed transiently
or, more preferably, stably.

The term "transforming" is used herein synonymously with transfection and means
introducing a nucleic acid molecule into a cell. Cells can be transformed by various
methods known in the art, including methods such as electroporation, lipofection,
viral infection, phage infection and microinjection. The term "introducing" refers to
a method of transfecting or transforming a host cell with a nucleic acid molecule
encoding, for example, a (poly)peptide comprising an interaction partner of the
biomolecule. Introduction of the construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, cationic
lipid-mediated transfection, electroporation, transduction, infection or by other
methods. Such methods are described in many standard laboratory manuals, such
as Davis et al., Basic Methods In Molecular Biology (1986) or Sambrook et al.,
"Molecular Cloning, A Laboratory Manual"; ISBN: 0879695765, CSH Press, Cold
Spring Harbor, 2001.

The term "nucleic acid molecule" comprises DNA and RNA which may be single
stranded or double stranded, circular or linear. "Nucleic acid molecules encoding an
interaction partner" relates to expression constructs capable of mediating protein
expression. Said nucleic acid molecule introduced into the host cell comprises an
open reading frame encoding an interaction partner of said biomolecule in
expressible form. "Expressible form" implies that the nucleic acid molecule may
contain regulatory sequences that allow expression of the encoded (poly)peptide
upon introduction of the nucleic acid molecule into the host cell.

The nucleic acid molecule introduced into the host cell may be part of an expression
vector. Such vectors are usually specific for prokaryotic or eukaryotic expression.
However, the nucleic acid molecule may also be part of a vector functional in
prokaryotes and eukaryotes.

A typical prokaryotic expression vector usually contains a promoter element, which
mediates the initiation of transcription of mRNA, the protein coding sequence, and
signals required for the termination of transcription of the transcript. Transcription of
DNA is dependent upon the presence of a promoter, which is a DNA sequence that
directs the binding of RNA polymerase and thereby promotes mRNA synthesis. The
DNA sequences of eukaryotic promoters differ from those of prokaryotic promoters.
Furthermore, eukaryotic promoters and accompanying genetic signals may not be
recognized in or may not function in a prokaryotic system, and, further, prokaryotic
promoters are not recognized and do not function in eukaryotic cells. Similarly,
translation of mRNA in prokaryotes depends upon the presence of the proper
prokaryotic signals, which differ from those of eukaryotes. Promoters vary in their
"strength" (i.e. their ability to promote transcription). In some cases it may be
desirable to use strong promoters in order to obtain a high level of transcription and,
hence, expression of a protein. Depending upon the host cell system utilized, any
one of a number of suitable promoters may be used. For instance, when using E.
coli, its bacteriophages, or plasmids, promoters such as the T7 phage promoter, lac
promoter, trp promoter, recA promoter, ribosomal RNA promoter, the PR and PL
promoters of coliphage lambda and others, including but not limited to lacUV5,
ompF, bla, lpp, and the like, may be used to direct high levels of transcription of
adjacent DNA segments. Additionally, a hybrid trp-lacUV5 (tac) promoter or other E.
coli promoters produced by recombinant DNA or other synthetic DNA techniques
may be used to provide for transcription of the inserted gene. Specific initiation
signals are also required for efficient gene transcription and translation in
prokaryotic cells. These transcription and translation initiation signals may vary in
"strength" as measured by the quantity of gene specific messenger RNA and protein
synthesized, respectively. The DNA expression vector or nucleic acid molecule
encoding a (poly)peptide, which contains a promoter, may also contain any
combination of various "strong" transcription and/or translation initiation signals. For
instance, efficient translation in E. coli requires a Shine-Dalgarno (SD) sequence
about 7-9 bases 5' to the initiation codon ("ATG") to provide a ribosome binding site.
The SD sequences are complementary to the 3'-end of the 16S rRNA (ribosomal RNA) and
probably promote binding of mRNA to ribosomes by duplexing with the rRNA to
allow correct positioning of the ribosome. For a review on maximizing gene
expression, see Roberts and Lauer, Methods in Enzymology, 68:473 (1979), which
is hereby incorporated by reference. Thus, an SD-ATG combination that can be
utilized by host cell ribosomes may be employed. Such combinations include but are
not limited to the SD-ATG combination from the cro gene or the N gene of coliphage
lambda, or from the E. coli tryptophan E, D, G, B or A genes. Additionally, any
SD-ATG combination produced by recombinant DNA or other techniques involving
incorporation of synthetic nucleotides may be used. Suitable expression vectors for
use in practicing the present invention include, for example, vectors such as pET
expression vectors (Novagen), pQE expression vectors (Qiagen), pASK expression
vectors (IBA), pBAD expression vectors (Invitrogen), pWH1950 (Ettner et al., 1996),
pWH1520 (MoBiTec), pWH353 with a Xyl/Tet promoter (Geissendorfer and Hillen,
1990), pLZ vectors with SPAC, Xyl and Xyl/Tet promoter systems (Zhang et al.,
2000) or pNZ8008 with a nisA promoter (Eichenbaum et al., 1998).
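The SD-ATG spacing requirement described above can be checked on a candidate construct with a short script. The following is a minimal sketch under stated assumptions: the core motif AGGAGG is used as a stand-in for an SD sequence (actual ribosome binding sites vary), the spacing is counted as the number of bases between the end of the core motif and the A of the ATG, and the function name `find_sd_atg` and the example sequences are hypothetical.

```python
SD_CORE = "AGGAGG"  # assumed consensus core; real SD sequences vary


def find_sd_atg(dna: str, core: str = SD_CORE, min_gap: int = 7, max_gap: int = 9):
    """Return (sd_start, atg_start) index pairs for SD-ATG combinations.

    A pair is reported when an ATG begins min_gap..max_gap bases after the
    last base of the SD core (assumed spacing convention). Indices are
    0-based positions in the given strand, read 5' to 3'.
    """
    dna = dna.upper()
    hits = []
    start = dna.find(core)
    while start != -1:
        core_end = start + len(core)
        for gap in range(min_gap, max_gap + 1):
            atg = core_end + gap
            if dna[atg:atg + 3] == "ATG":
                hits.append((start, atg))
        start = dna.find(core, start + 1)  # continue with the next core motif
    return hits


if __name__ == "__main__":
    # Hypothetical construct: SD core, 7-base spacer, then the start codon.
    good = "TTT" + "AGGAGG" + "CCCCCCC" + "ATG" + "GCT"
    # Hypothetical construct with a spacer that is too short (3 bases).
    short = "AGGAGG" + "CCC" + "ATG"
    print(find_sd_atg(good))
    print(find_sd_atg(short))
```

A construct for which the function returns an empty list has no start codon within the assumed window downstream of the motif, which would suggest revisiting the spacer length before cloning.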

A typical eukaryotic expression vector contains the promoter element, which
mediates the initiation of transcription of mRNA, the protein coding sequence, and
signals required for the termination of transcription and polyadenylation of the
transcript. Additional elements include enhancers, Kozak sequences and
intervening sequences flanked by donor and acceptor sites for RNA splicing. Highly
efficient transcription can be achieved with the early and late promoters from SV40,
the long terminal repeats (LTRs) from Retroviruses, e.g., RSV, HTLVI, HIVI and the
early promoter of the cytomegalovirus (CMV). However, cellular elements can also
be used (e.g., the human actin, EF-1α and ubiquitin C promoters). Suitable
expression vectors for use in practicing the present invention include, for example,
vectors such as pSVL and pMSG (Pharmacia, Uppsala, Sweden), pRSVcat (ATCC
37152), pSV2dhfr (ATCC 37146) and pBC12MI (ATCC 67109). Host cells that could
be used include human Hela, 293, H9 and Jurkat cells, mouse NIH3T3 and C127
cells, Cos 1, Cos 7 and CV1, quail QC1-3 cells, mouse L cells and Chinese hamster
ovary (CHO) cells. Alternatively, the encoded (poly)peptides can be expressed in
stable cell lines that contain a nucleic acid molecule of interest integrated into a
chromosome. The co-transfection with a selectable marker such as dhfr, gpt,
neomycin or hygromycin allows the identification and isolation of the transfected cells.
The transfected gene can also be amplified to express large amounts of the
encoded protein. The DHFR (dihydrofolate reductase) marker is useful to develop
cell lines that carry several hundred or even several thousand copies of the gene of
interest. Another useful selection marker is the enzyme glutamine synthase (GS)
(Murphy et al., Biochem J. 227:277-279 (1991); Bebbington et al., Bio/Technology
10:169-175 (1992)). Using these markers, the cells are grown in selective medium
and the cells with the highest resistance are selected. These cell lines contain the
amplified gene(s) integrated into a chromosome. Chinese hamster ovary (CHO) and
NSO cells are often used for the production of proteins. The expression vectors pC1
and pC4 contain the strong promoter (LTR) of the Rous Sarcoma Virus (Cullen et
al., Molecular and Cellular Biology, 438-447 (March, 1985)) plus a fragment of the
CMV-enhancer (Boshart et al., Cell 47:521-530 (1985)). Multiple cloning sites, e.g.,
with the restriction enzyme cleavage sites BamHI, XbaI and Asp718, facilitate the
cloning of the gene of interest. The vectors contain in addition the 3' intron, the
polyadenylation and termination signal of the rat preproinsulin gene. As indicated
above, the expression vectors will preferably include at least one selectable marker.
Such markers include dihydrofolate reductase, Herpes simplex virus thymidine
kinase, G418 or puromycin or zeocin resistance for eukaryotic cell culture and
tetracycline, hygromycin, apramycin, zeocin, kanamycin or ampicillin resistance
genes for culturing in E. coli and other bacteria. Representative examples of
appropriate hosts include, but are not limited to, bacterial cells, such as E. coli,
Streptomyces and Salmonella typhimurium cells; Staphylococcus aureus,
Streptococcus pneumoniae, Corynebacterium glutamicum or Bacillus subtilis; fungal
cells, such as yeast cells; insect cells such as Drosophila S2 and Spodoptera Sf9
cells; animal cells such as CHO, COS, 293 and Bowes melanoma cells; and plant
cells. Appropriate culture media and conditions for the above-described host cells
are known in the art.

The term "interaction partner of said biomolecule" refers to a string of amino acid
residues with affinity to a (poly)peptide or aptamer, allosteric ribozyme, riboswitch
or ribonucleoprotein. Said string of amino acid residues may be a peptide or a
polypeptide. Preferably, the string of amino acids with affinity to the biomolecule is
part of a fusion protein. The interaction partner may be present at the carboxy
terminal end of the fusion protein; it can, however, also be located at the amino
terminal end or at an internal position of the amino acid sequence of the protein,
provided that this is not associated with a negative property such as, e.g., impeding
or destroying the biological activity. The larger subunit of the fusion protein can
be a complete protein or a mutant of a protein such as, e.g., a deletion mutant or
substitution mutant. If the biomolecule is the Tet repressor or comprises
(poly)peptide sequences of the Tet repressor, the interaction partner is preferably
the polypeptide of the present invention or the polypeptide expressed by the nucleic
acid molecule of the present invention or a fusion protein comprising said
(poly)peptide.



This is experimentally feasible, as different sets of more than 6000 yeast strains,
each expressing a distinct GST-ORF, GFP-ORF or TAP-ORF fusion, have already
been generated (Ghaemmaghami et al., 2003; Huh et al., 2003; Martzen et al.,
1999).

Components of tetracycline (tc)-dependent gene regulation systems are commonly
used and have been introduced into almost every eukaryotic and prokaryotic model
organism to conditionally control the expression of individual target genes. Thus,
transgenic organisms and cell lines containing tc-dependent regulatory systems are
widely distributed among academic and industrial research groups. Typically, the
expression of a gene is controlled by externally applying tc or one of its more active
derivatives, doxycycline or anhydrotetracycline. Using phage display, as disclosed in
the present invention, we have now identified an oligopeptide that, when expressed
as a fusion with another protein like, for example, thioredoxin (Trx), can induce
expression of a gene under control of TetR in vivo. Such a peptide has not been
described for TetR or any other allosteric regulatory protein. If the reporter gene
under Tet control is sensitive and can be quantified over a wide expression range,
like firefly luciferase, it is possible to quantify the expression of an endogenous
protein that has been tagged with the peptide. If the reporter gene expressed is
green fluorescent protein, fluorescence activated cell sorting techniques can be
used to separate cells that express the tagged protein from those that do not. In
bacteria, for example, this system can be used as a genetic reporter system to
address questions or problems in proteomics, once a collection of mutant strains
has been constructed that each contain a differently tagged protein, thus
representing the entire proteome of the respective bacterium. The "ensemble" of
these strains can then be subjected to different growth conditions, like heat shock, N
source limitation, sporulation or the onset of stationary phase. The expression of the
reporter gene in an individual bacterium will reflect the expression pattern of the
respective "tagged" protein, and the collection of tagged strains will thus reflect the
"proteome" under the respective condition. The intensity of the reporter gene signal
correlates with the expression level of the expressed and tagged protein. This allows
the introduction of genetic methods to the field of proteomics. According to the
present state of the art, this still requires protein-biochemical methods which are
much more laborious and time-consuming. Tagging approaches to detect protein
expression visually by fusing GFP to the C-terminus of an open reading frame (Huh et
al., 2003) or to quantify protein expression by fusing a TAP tag to the C-terminus of
an open reading frame (Ghaemmaghami et al., 2003) allow either separation of cells
expressing a tagged protein by FACS and quantification in the case of highly
expressed proteins (GFP-tag) or a precise quantification of a tagged protein's
expression level (TAP-tag). Thus, quantification and visual detection require the
introduction of two independent tags per protein.

The invention can be used to facilitate proteome analysis. First, a bacterial
population has to be generated that contains a set of individual bacteria that,
together, constitute the proteome of the organism. To achieve this, every single
bacterium in this population carries a single encoded protein that has been tagged
with the interacting peptide. In Bacillus subtilis or Escherichia coli, this would 

organisms, like Streptomyces 
coelicolor, the number of strains needed would be about 7500, in Mycoplasma
pneumoniae only about 650. This is experimentally feasible, as a set of 6144 yeast
strains, each expressing a distinct GST-ORF fusion, has already been generated
(Martzen et al., 1999). Automated parallel approaches have allowed generating up
to 400 different expression vectors over a three day period. For an organism with a
genome about the size of yeast, this time-frame would extrapolate to a few months
of work (Phizicky et al., 2003). (see figure 1)
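The extrapolation above can be made explicit with a short calculation. This is a minimal sketch that assumes vector construction proceeds in consecutive batches of up to 400 constructs per three days, as quoted in the text; it covers vector generation only, so strain construction and verification, which presumably account for the remainder of the "few months", are not modelled. The function name `construction_days` is hypothetical.

```python
import math


def construction_days(n_constructs: int, batch_size: int = 400,
                      days_per_batch: int = 3) -> int:
    """Days needed to build n_constructs vectors in consecutive batches.

    Assumption: batches run one after another and a partial batch still
    takes the full three-day period.
    """
    batches = math.ceil(n_constructs / batch_size)
    return batches * days_per_batch


if __name__ == "__main__":
    # Strain counts quoted in the text.
    for organism, n in [("yeast GST-ORF set", 6144),
                        ("Streptomyces coelicolor", 7500),
                        ("Mycoplasma pneumoniae", 650)]:
        print(organism, construction_days(n), "days")
```

Under these assumptions the yeast-sized set needs 16 batches, that is 48 days of vector construction, which is consistent with an overall time-frame of a few months once downstream steps are included.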

Once the set of strains has been generated, the bacterial population carrying the
tagged proteome can be used to explore the bacterial proteome. The bacterial
population would be grown under different conditions, e.g. rich medium, minimal
medium, different growth temperatures, alternative carbon or nitrogen sources,
logarithmic phase, stationary phase, exposure to different stresses like changes in
the osmolarity, oxygen availability, different antibiotics or bacterial toxins, to name a
few, but not necessarily all, possible altered growth conditions. This could be done in
an array-like setup in microtiter plates to facilitate identification of expressed protein
or, more classically, on plates. Strains that express a tagged protein under the
respective environmental conditions are identified by GFP fluorescence, either
visually or under a fluorescence-microplate reader. The corresponding proteins are
identified either by reading out the position in the strain array or by direct
sequencing of genomic DNA (ABI Prism 310 Genetic Analyzer User's Manual) using
the known DNA sequence of the tag. (see figure 2)
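Reading out the position in the strain array amounts to a lookup from well position to tagged open reading frame, filtered by reporter fluorescence. The following is a minimal sketch; the well labels, ORF names, readings, background value and the three-fold cut-off are all hypothetical illustration values and are not taken from the specification.

```python
def expressed_orfs(layout: dict, readings: dict, background: float,
                   fold: float = 3.0) -> list:
    """Return the tagged ORFs whose wells fluoresce above fold * background.

    layout maps well position -> tagged ORF; readings maps well position ->
    GFP fluorescence (arbitrary units). Wells without a reading are treated
    as not expressing. The fold cut-off is an assumed, adjustable threshold.
    """
    cutoff = fold * background
    return sorted(orf for well, orf in layout.items()
                  if readings.get(well, 0.0) > cutoff)


if __name__ == "__main__":
    # Hypothetical three-well excerpt of a strain array.
    layout = {"A1": "orfA", "A2": "orfB", "A3": "orfC"}
    readings = {"A1": 120.0, "A2": 950.0, "A3": 2400.0}
    print(expressed_orfs(layout, readings, background=100.0))
```

Comparing the lists returned for the same array grown under two conditions gives the set of tagged proteins that are expressed under one condition but not the other.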

The teaching of the present application is particularly useful in disease proteomics. 
As of yet, proteome analysis of pathogenic microorganisms tries to mimic ex vivo
the environmental conditions supposed to be encountered by the microorganism in
the host [Streptococcus mutans (Len et al., 2003); Salmonella typhimurium
(Deiwick et al., 2002); Pseudomonas aeruginosa (Guina et al., 2003); Leishmania
infantum (El Fakhry et al., 2002); Mycobacterium tuberculosis (Rosenkrands et
al., 2002)]. In vivo analysis of pathogenicity-specific genes is mainly conducted by
microarray analysis of mRNA expression (Chan et al., 2002; Eriksson et al., 2003;
Okinaka et al., 2002; Staudinger et al., 2002) or analysis of bacterial strains
expressing different antisense constructs (Ji et al., 2001). The invention described
here can facilitate the analysis of protein expression in the pathogenic organism in
an in vivo animal disease model. The animal model is infected using one of the
methods of the present invention by a bacterial population containing several copies
of the tagged proteome. The disease is allowed to develop. At an appropriate
timepoint, the animal is sacrificed, the respective organs and tissues containing the
pathogenic microorganism are isolated, bacterial translation is arrested by addition
of a bacteriostatic translation inhibitor like chloramphenicol and the tissue (if
necessary) is homogenized. Bacteria are isolated by filtration and sorted by FACS
(fluorescence activated cell sorting) into GFP-expressing and GFP-non-expressing
populations. These are sorted into single wells and streaked to clonal populations.
Due to the defined tag sequence, the individual protein that was expressed in the
host organism can then be identified by direct genomic sequencing.

The teaching of the present application is also particularly useful to facilitate
analysis of expression of proteins that are difficult to detect by conventional
proteomics. Low-abundance proteins like transcriptional regulators or signal
transducing proteins like kinases are often not detectable by standard proteome
methods, even though many of these proteins are important for pathogenesis or
development (Deiwick et al., 2002; Gygi et al., 2000). Induction of reporter gene
expression by binding of a tagged protein to TetR leads to amplification of the initial
signal by allowing multiple transcription events and subsequent multiple rounds of
translation per mRNA molecule.

Moreover, the teaching of the present application is also particularly useful to allow 
exogenous control of endogenous riboswitch-controlled translational regulation 
systems (Stormo, 2003). Riboswitches have been identified in all three kingdoms 
(Sudarsan et al., 2003). Examples of molecules that bind to riboswitches are
thiamine (Winkler et al., 2002a), S-adenosylmethionine (Epshtein et al., 2003;
Winkler et al., 2003), FMN (Winkler et al., 2002b) and guanine (Mandal et al., 2003).
These are all common metabolites and their intracellular concentration cannot be
easily manipulated. Inducible expression of peptides that recognize these RNA
elements and can flip the switch would allow external control of central metabolic
pathways without having to interfere with the natural genetic situation through the 
introduction of foreign control elements (repressors, activators) in classical 
transcription control. 

In a preferred embodiment of the present invention's method, the activity or degree 
of modulation of the activity of the biomolecule is measured via a readout system. 
The readout system can be based on the detection of a transcription product, a 
translation product or on the activity of a translated (poly)peptide. The readout system can
be any detectable product, the expression level of which is modulated by the action
of the biomolecule. The readout may be a signal of a specific wavelength such as those
generated by GFP or luciferase or the like. Also comprised by the present invention
are enzyme based readout systems such as the CAT system. The readout system
is controlled by a binding site for the regulatory biomolecule. For example, in the
case of a biomolecule with repressing activity, the binding site controlling the
readout system is bound by said "biomolecule with repressing activity" and thereby
expression from said readout system is blocked. Upon binding by the interaction
partner disclosed in the present invention, the biomolecule is released and
expression from the readout system is achieved.

In a more preferred embodiment of the present invention's method, (a) the readout
system is provided by a nucleic acid molecule encoding a reporter protein; (b) the
biomolecule is (i) a nucleic acid binding (poly)peptide selected from the group
consisting of regulator of transcription, regulator of translation, recombinase,
(poly)peptide involved in RNA transport or (ii) an aptamer, allosteric ribozyme or
riboswitch and (c) the nucleic acid binding biomolecule is allosterically
regulated. Aptamers, allosteric ribozymes and riboswitches are RNA molecules capable of
performing a conformational switch after ligand binding. Peptide binding to RNA has
been demonstrated for Rev-RRE (Gosser et al., 2001; Harada et al., 1996; Harada
et al., 1999; Zhang et al., 2001) and Tar-TAT (Calnan et al., 1991; Weeks et al.,
1990), indicating that selection of peptides binding to a ligand binding site is a
feasible approach. The principle of allosteric ribozymes is outlined in (Soukup and
Breaker, 1999) and well known examples are hammerhead ribozymes dependent
on the presence of ATP (Tang and Breaker, 1997), theophylline (Soukup et al.,
2000) or cGMP/cAMP (Koizumi et al., 1999) for activity. Several examples of
riboswitches have been described in the literature. The best described is the vitamin
B1-dependent thiM switch from Escherichia coli (Winkler et al., 2002a). Two
examples of artificially generated ligand-controlled translational switches are
described in (Suess et al., 2003) and (Werstuck and Green, 1998). The present
invention particularly refers to the following RNA-binding (poly)peptides and their
recognition sequences which may be used in accordance with the teaching of the
present invention and in any of the methods disclosed in the present invention:
REV binding peptide: clone 3-AAAAGRRARRRRRRRRQSCRRKMTRD (Tan and
Frankel, 1998);
RRE site: 5'-UGGGCGCAGCGUCAAUGACGCUGACGGUACA-3'
(Peterson and Feigon, 1996);
TAT peptide fragment: Tfr38-
RKKRRQRRRPPQGSQTHQVSLSKQPTSQPRGDPTGPKE (Weeks et al., 1990);
TAR site: 5'-
GGUCUCUCUGGUUAGACCAGAUCUGAGCCUGGGAGCUCUCUGGCUAACUAG
AGAACCC-3' (Weeks et al., 1990);
ATP-sensitive allosteric ribozyme: (i) ribozyme strand: 5'-
GGGCGACCCUGAUGAGUUGGGAAGAAACUGUGGCACUUCGGUGCCAGCAAC
GAAACGGU-3'; (ii) substrate strand: 5'-GCCGUAGGUUGCCC-3' (Tang and
Breaker, 1998);
theophylline-sensitive allosteric ribozyme: cm + theo5: 5'-
GGGCGACCCUGAUGAGCCUGGAUACCAGCCGAAAGGCCCUUGGCAGUUAGA
CGAAACGGUGAAAGCCGUAGGUUGCCC-3' (Soukup et al., 2000);
cGMP-sensitive allosteric ribozyme: cGMP1: 5'-
GGGCGACCCUGAUGAGCCUUGCGAUGCAAAAAGGUGCUGACGACACAUCGA
AACGGUGAAAGCCGUAGGUUGCCC-3' (Koizumi et al., 1999);
Escherichia coli thiM riboswitch: 5'-
GGAACCAAACGACUCGGGGUGGCCUUCUGCGUGAAGGCUGAGAAAUACCCG
UAUCACCUGAUCUGGAUAAUGCCAGCGUAGGGAAGUCACGGACCACCAGGU
CAUUGCUUCUUCACGUUAUGGCAGGAGCAAACUAUGCAAGUCGACCUGCUG
GAUCCAGCGCAA-3' (Winkler et al., 2002a);
tetracycline-dependent translational switch: C b32 5'-
CUUAAGGCCUGUACUGCUGCUUAAGGCCUAAAACAUACCAGAUCGCCACCC
GCGCUUUAAUCUGGAGAGGUGAAGAAUACGACCACCUAGGCCAAAAUGGCU
AGC-3' (Suess et al., 2003);
Hoechst33258-specific translational switch: H19 5'-
GGUGAUCAGAUUCUGAUCCAACAGGUUAUGUAGUCUCCUACCUCUGCGCCU
GAAGCUUGGAUCCGUCGC-3' (Werstuck and Green, 1998).

The methods disclosed in the present invention allow the development of peptides or
polypeptides capable of interacting with the above-mentioned biomolecules. In
accordance with the present invention, it is required that the interacting
(poly)peptides not only bind to the biomolecules but also modulate their activity,
preferably their nucleic acid binding affinity. For example, by using phage display or
antibody libraries, new interaction partners of the Tet repressor might be identified
which are capable of reducing the affinity of the Tet repressor for its recognition
sequence. These newly identified sequences can either be expressed on their own
or as part of a fusion protein and, thus, regulate the expression of the reporter
protein located downstream of the recognition sequence of the Tet repressor.

Although only specific examples for interaction partners of Tet repressor are
disclosed in the present invention, the skilled person can, based on the teaching of 
the present invention, identify modulators of any other nucleic acid binding protein. 
Generally, the interaction partner is a biological macromolecule. Preferably, new 
interaction partners are identified by using phage display or by screening antibody 
libraries. Alternatively, interaction partners may be identified by ribozyme display, 
mRNA display, cell-surface display and/or lipocalin-based libraries. Peptide libraries 
based on these scaffolds (lipocalins, cell surface display, mRNA display) have been 
constructed, published and successfully used to select target-specific peptides.

Phage display and combinatorial methods for generating interaction partners are
known in the art (as described in, e.g., Ladner et al. U.S. Pat. No. 5,223,409; Kang
et al. International Publication No. WO 92/18619; Dower et al. International
Publication No. WO 91/17271; Winter et al. International Publication WO 92/20791;
Markland et al. International Publication No. WO 92/15679; Breitling et al.
International Publication WO 93/01288; McCafferty et al. International Publication
No. WO 92/01047; Garrard et al. International Publication No. WO 92/09690;
Ladner et al. International Publication No. WO 90/02809; Fuchs et al. (1991)
Bio/Technology 9:1370-1372; Hay et al. (1992) Hum Antibod Hybridomas 3:81-85;
Huse et al. (1989) Science 246:1275-1281; Griffiths et al. (1993) EMBO J 12:725-734;
Hawkins et al. (1992) J Mol Biol 226:889-896; Clackson et al. (1991) Nature
352:624-628; Gram et al. (1992) PNAS 89:3576-3580; Garrard et al. (1991)
Bio/Technology 9:1373-1377; Hoogenboom et al. (1991) Nuc Acid Res 19:4133-4137;
and Barbas et al. (1991) PNAS 88:7978-7982, the contents of all of which are
incorporated by reference herein).

The basis for most phage display applications is the filamentous phage M13 from
Escherichia coli. A surface protein, gpIII, which is present in five copies per phage
and important for infection of the bacteria, can tolerate N-terminal insertions to a
certain degree. Since the N-terminus of this protein is solvent-exposed, the
insertions are presented on the surface of the phage (Kay et al., 2001). The
commercially available phage bank Ph.D.-12™ from New England Biolabs is
preferably used to select for interaction partners. Alternatively, Ph.D.-7™, Ph.D.-C7C™
(New England Biolabs), FliTrx Random Peptide Display Library (Invitrogen),
Ready-To-Use Phage Display cDNA Libraries (Spring Bioscience) or pSKAN
Phagemid Display System (MoBiTec) may be used. Phage bank Ph.D.-12™
contains ~10^9 different dodecapeptides fused via a flexible linker of four amino acids
to the N-terminus of the gpIII protein. The in vitro selection may be performed by
coating polystyrene tubes (NUNC Maxisorb) with purified biomolecule. Generally,
the tubes are then incubated with the pool of M13 phages, washed several times
and phages are eluted either specifically by addition of biomolecule or unspecifically
by lowering the pH value. Individual M13 clones are usually picked and sequenced
after three selection rounds.

Binding of individual phage clones to the biomolecule may be determined by using
an ELISA (Enzyme Linked Immunosorbent Assay): the phages are amplified,
precipitated and resuspended in a small volume. Subsequently, 96-well microtiter
plates may be coated with the biomolecule, then blocked with a blocking reagent
such as Bovine Serum Albumin, followed by incubation with increasing amounts of
M13 phages from different isolates, several washes with buffer and, finally,
incubation with a phage-specific monoclonal antibody covalently coupled to
horseradish peroxidase. Addition of ABTS (2,2'-Azino-bis(ethylbenzthiazolin-6-
sulfonic acid)) as substrate permits the spectrophotometric detection of phage-binding
to TetR. The degree of absorption thereby serves as a quantitative indicator of
phages bound to the target protein (Kay et al., 2001).

Repressor-inducing interaction partners may be isolated as discussed below. Since
small oligopeptides are rapidly degraded intracellularly by proteases, the peptide-
encoding sequences may be cloned as C-terminal fusions to the Escherichia coli
protein thioredoxin, an established carrier protein for peptides (Park and Raines,
2000). The thioredoxin fusion proteins can be expressed, for example, by using a
plasmid containing a tac promoter under control of Lac repressor (Ettner et al.,
1996). The biomolecule of interest may be expressed constitutively at a low level.
Preferably, in particular if the regulatory biomolecule is the Tet repressor, the
indicator strain is E. coli DH5α(λtet50) containing the phage λtet50 (Smith and
Bertrand, 1988) integrated in single copy into the E. coli genome. This phage
contains a tetA-lacZ transcriptional fusion. Expression of β-galactosidase is, thus,
regulated by TetR that binds to tetO sequences located within the promoter. The
pool of potential interaction partners is screened, for example, by plating
transformed colonies on MacConkey agar containing IPTG. When using the above-
indicated expression vector under suitable conditions, an inducing interaction
partner of the biomolecule, in this case of the Tet repressor, will lead to the
expression of β-galactosidase, resulting in an acidification of the medium
surrounding the colony which can be detected by its yellow color.

Preferably, individual candidates identified by sequencing are cloned as fusion
proteins and transformed into suitable reporter strains. For example, if the repressor is
the Tet repressor, the bacterial strain DH5α(λtet50) may be used and the respective
β-galactosidase activity is subsequently determined. For example, if the repressor is
the Tet repressor, the following controls might also be included in the
measurements: To define the regulatory window, both the repressed state (0% -
TetR binds to tetO) and the fully induced state in the presence of tetracycline (tc) (100% - TetR
dissociates from tetO) may be determined. To exclude that the remainder of the
fusion protein, i.e. the carrier protein, interacts unspecifically, i.e. by amino acid
residues of the carrier protein and not by amino acid residues of the interaction
partner, a plasmid expressing the carrier protein without a peptide fusion may also
be assayed in the presence and absence of IPTG. The β-galactosidase activities of
the individual candidates cloned can also be determined in the presence and absence
of IPTG.
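The regulatory window described above can be used to put candidate measurements on a common scale. The following is a minimal sketch, assuming that β-galactosidase activities are available as plain numbers in arbitrary units; the function name and the example values are illustrative and not taken from the present description.

```python
def percent_induction(activity, repressed, induced):
    """Scale a beta-galactosidase activity to the regulatory window.

    repressed: activity with TetR bound to tetO (defines 0%)
    induced:   activity in the presence of tetracycline (defines 100%)
    """
    if induced <= repressed:
        raise ValueError("induced activity must exceed repressed activity")
    return 100.0 * (activity - repressed) / (induced - repressed)

# Illustrative values only (arbitrary units)
repressed, induced = 20.0, 2020.0
candidate_with_iptg = 1020.0   # peptide fusion expressed
carrier_with_iptg = 20.0       # carrier protein without peptide fusion

print(percent_induction(candidate_with_iptg, repressed, induced))
print(percent_induction(carrier_with_iptg, repressed, induced))
```

A candidate is of interest when its value with IPTG is clearly above both its own value without IPTG and that of the carrier-only control.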


Interaction sites between the interaction partner and the biomolecule may be
determined by epitope mapping. Particularly preferred is the isolation of the
interaction site by in vivo epitope mapping, taking advantage of the observation that
TetR(B/D) is not induced by the peptide. Chimeric repressor molecules may be
constructed in which tetR(B) sequences are exchanged to different extents by the
corresponding sequences from tetR(D) (Schnappinger et al., 1998). In vivo
inducibility is determined by TrxA-pepBs1. In the case of Tet repressor, the results
of inducibility profiles show that interactions between repressor and peptide are
confined to the region from helix α8 to residue 182 in helix α10 (see table 1). The
loop connecting the helices α8 and α9 also appears to be important, as a chimera
containing tetR(D) sequences at residues 153-167 is not inducible by TrxA-pepBs1.
The loop between helices α9 and α10 also seems to be important, as chimeras
containing tetR(D) sequences between residues 179-184 and 180-184 are not
inducible by TrxA-pepBs1.

In particular cases it may be desirable to increase or decrease the strength of the
interaction between the biomolecule and the interacting peptide or polypeptide. In
these cases, the interaction can be modified by

(i) altering the fusion point of the peptide with the target protein. For thioredoxin, an
N-terminal fusion induces TetR more sensitively than a C-terminal one (Figure 17),

(ii) altering the expression level of TetR. TetR is expressed constitutively to a
higher level from a pWH1411-derivative than from a pWH510-derivative
(Wissmann et al., 1991). Consequently, higher amounts of N-terminally
peptide-tagged thioredoxin are needed for full induction of β-galactosidase activity
(Figure 16) and/or by

(iii) selectively exchanging, adding or deleting amino acid residues within the
interacting (poly)peptide or the biomolecule. Optimal interaction may be achieved by
using a rational design or by randomizing particular subunits of the (poly)peptide or
biomolecule. Rational design of interaction partners may first require obtaining
structural information on the interacting (poly)peptide and the biomolecule or using
available structural information. For example, by using the three-dimensional
information available for the Lac repressor in a method of computer modeling, new
peptide sequences can be identified that are capable of binding to and inducing a
conformational change in the Lac repressor which will then modify the nucleic acid
binding affinity of the Lac repressor. On the other hand, known modulators of the
biomolecules might be used in any of the methods of the present invention.
Examples of proteins acting as cofactor of a regulator include phosphorylated Hpr and
CcpA from Bacilli (Aung-Hilbrich et al., 2002), TnrA and feedback-inhibited
glutamine synthetase from Bacillus subtilis (Wray et al., 2001), Mlc and
dephosphorylated PtsG (Lee et al., 2000), or MalT and MalY (Schreiber et al., 2000)
from Escherichia coli. Moreover, known interaction partners of regulatory
biomolecules may be expressed, for example in a phage display library, wherein
portions of their sequence are randomized. This will allow the isolation of new interaction
partners with modified properties, either binding more tightly or less tightly to the
biomolecule.

In another preferred embodiment of the present invention's method, the transformed
nucleic acid molecule encoding the interaction partner of the regulatory biomolecule 
is a nucleic acid molecule comprising: (a) a nucleic acid sequence encoding a 
peptide tag or a tagged (poly)peptide, operatively linked to expression control 
sequences; or (b) a nucleic acid sequence encoding a tagged (poly)peptide or a 
peptide tag, a selectable marker gene and additional nucleotide sequences for site 
specific, in-frame integration of said nucleic acid molecule into the coding sequence
of at least one host (poly)peptide of interest. Constructs containing a nucleic acid
sequence encoding a tagged (poly)peptide or a peptide tag, a selectable marker
gene and additional nucleotide sequences for site specific, in-frame integration of
said nucleic acid molecule into the coding sequence of at least one host
(poly)peptide of interest are particularly useful for generating host cells expressing
tagged host proteins. Such constructs allow in-frame integration of tag encoding
sequences into chromosomal sequences encoding, for example, a host protein. A
DNA module that begins with the interaction tag-encoding sequence and includes a
marker that is both excisable and selectable [present on plasmids like pKD3, pKD4,
pKD13 (Datsenko and Wanner, 2000)] may be amplified by a standard polymerase chain
reaction with primers that carry extensions homologous to the C-terminal end of the
targeted gene and to a region downstream of it in the host genome. Transformation
of a strain expressing bacteriophage λ Red functions [encoded, for example, on a
plasmid like pKD20 (Datsenko and Wanner, 2000)] yields recombinants carrying the
targeted gene fused to the interaction tag-encoding sequence [(Uzzau et al., 2001);
a modification of a method originally described by Datsenko and Wanner (2000)].
The selectable marker may be, but is not limited to, an expression cassette
conferring resistance against an antibiotic like ampicillin, apramycin,
chloramphenicol, erythromycin, hygromycin, kanamycin, tetracycline, or zeocin. It is
flanked by recognition sites for site-specific recombinases, like FRT/Flp [expressed
from a plasmid like pCP20 (Cherepanov and Wackernagel, 1995)] or loxP/Cre. The
PCR primers are 56 to 60 nucleotides long, (i) anneal to constant regions
of 20 to 21 nucleotides of the amplification template and (ii) carry extensions of 36 to
40 nucleotides that are identical in sequence to the last portion of the gene to be
targeted (downstream primer) and to a region downstream of the gene (upstream
primer). The PCR fragments are purified and introduced into the strain of interest
and selected for recombinants carrying the amplified sequence in the genome. As a
result of this targeted insertion, the host protein contains an N-terminal, C-terminal
or internal amino acid sequence capable of interacting with the regulatory
biomolecule. Upon expression of the host protein, the biomolecule will interact with
the tagged host protein. In cases when the biomolecule is a nucleic acid binding
(poly)peptide, this interaction results in a modulation of the biomolecule's affinity to
its recognition sequence. For example, a repressor might be switched into its
inactive state and thereby allow transcription of the gene controlled by the
repressor.
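The primer layout described above (a 36 to 40 nucleotide homology extension followed by a 20 to 21 nucleotide priming region) can be sketched as follows. This is an illustration under stated assumptions, not the protocol of the cited references: all sequences are given as the top (coding) strand, the gene's stop codon is assumed to be excluded from `gene_end` so that the tag is read in frame, and the function and argument names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def tagging_primers(gene_end, downstream, module_start, module_end, homology=40):
    """Build the two primers used to amplify the tag/marker module.

    gene_end:     last portion of the targeted gene, stop codon excluded
    downstream:   genomic sequence downstream of the gene
    module_start: constant priming region at the start of the module
    module_end:   constant priming region at the end of the module
    All arguments are top-strand sequences written 5'->3'.
    """
    if not 36 <= homology <= 40:
        raise ValueError("homology extension should be 36 to 40 nt")
    if len(gene_end) < homology or len(downstream) < homology:
        raise ValueError("flanking sequences shorter than the homology length")
    forward = gene_end[-homology:] + module_start
    reverse = revcomp(downstream[:homology]) + revcomp(module_end)
    return forward, reverse

# Placeholder sequences for illustration only
gene_end = "ACGTTGCA" * 6          # 48 nt
downstream = "GGATCCTA" * 6        # 48 nt
module_start = "ATGGCTAGCTAGGCTTACGA"   # 20 nt
module_end = "CCTTAGCAGGTCAAGCTTGA"     # 20 nt
fwd, rev = tagging_primers(gene_end, downstream, module_start, module_end)
print(len(fwd), len(rev))
```

Keeping both primers as 5'->3' strings makes it easy to check that the amplified module carries the intended homology at each end.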

As outlined above, the interaction partner of the biomolecule may be a peptide tag
or a tagged (poly)peptide. A tagged (poly)peptide is a fusion protein containing 
amino acid residues capable of interacting with the biomolecule. Preferred fusion 
peptides are those encoded by the nucleic acid molecules of the present invention 
in particular when the biomolecule is the Tet repressor or comprises a fragment of 
the Tet repressor. The interaction partner may be located at N- or C-terminal 
position of the fusion peptide or may be located at an internal position, provided that 
the activity of the biomolecule is not affected by the presence of the interaction 
partner. In some cases, it may be desirable to add the interaction partner externally, 
i.e. by adding, for example, the peptide or polypeptide to the culture medium of the
cell. In these cases the interaction partner also may be a peptide attached to a
carrier polypeptide. Preferably, said attachment is a covalent attachment, for
example achieved by crosslinking the interacting peptide with the carrier polypeptide.
The conditions and reagents required for crosslinking are known to the person
skilled in the art. A comprehensive product guide and reagents for crosslinking are,
for example, obtainable from Pierce Biotechnology, Inc., P.O. Box 117, Rockford, IL
61105, USA.

In another preferred embodiment of the present invention, the nucleic acid binding
(poly)peptide comprises the (poly)peptide sequence of (a) Tet repressor; (b) lac
repressor; (c) Xylose repressor; (d) AraC protein; (e) TetR-based T-Rex system; (f)
erythromycin-specific repressor MphR(A); (g) Pip (pristinamycin interacting protein);
(h) ScbR from Streptomyces coelicolor; (i) TraR of Agrobacterium tumefaciens,
fused to the eukaryotic activation domain p65 of NFkB; (j) chimeric proteins
consisting of the: (i) Gal4 DNA-binding domain and either a full-length PhyA protein
(PhyA-GBD) or the N-terminal photosensory domain of PhyB [PhyB(NT)-GBD] of
Arabidopsis thaliana; (ii) steroid hormone regulated systems like the GeneSwitch
regulatory system from Valentis (www.geneswitch.com), sold by Invitrogen (catalog
number K1060-02) and consisting of (1) a Gal4 DNA-binding domain fused to a
human progesterone receptor ligand binding domain and an NFkB-derived p65
eukaryotic transcription activation domain and (2) the inducer mifepristone; (iii)
dimerizer systems from ARIAD consisting of (1) a ZFHD1 DNA-binding domain
fused to FKBP12, (2) a modified form of FRAP, termed FRB, fused to the
NFkB-derived p65 activation domain and (3) rapamycin, AP22565 or AP12967 as
heterodimer-forming agents, as they bind to both (1) and (2); (iv)
Ecdysone-Inducible Expression System (with the Inducing Agents Ponasterone A and
Muristerone A) from Invitrogen (catalog numbers K1001-01, K1003-01, K1004-01)
containing (1) a modified form of the Drosophila ecdysone receptor (VgEcR) fused
to a VP16 activation domain, (2) the mammalian homologue RXR of ultraspiracle,
the natural binding partner of the ecdysone receptor in Drosophila and (3) the
inducers ponasterone A and muristerone A. Particular examples of (poly)peptides
which may be used in accordance with the teaching of the present invention are the
following (poly)peptides: (a) Tet repressor: GenBank accession number: X00694
MSRLDKSKVINSALELLNEVGIEGLTTRKLAQKLGVEQPTLYWHVKNKRALLDALAI 
EMLDRHHTHFCPLEGESWQDFLRNNAKSFRCALLSHRDGAKVHLGTRPTEKQYE 
TLENQLAFLCQQGFSLENALYALSAVGHFTLGCVLEDQEHQVAKEERETPTTDSM 
PPLLRQAIELFDHQGAEPAFLFGLELIICGLEKQLKCESGS; (a-1) tTA (TetR-VP16)
(Gossen and Bujard, 1992):

MSRLDKSKVINSALELLNEVGIEGLTTRKLAQKLGVEQPTLYWHVKNKRALLDALAI 
EMLDRHHTHFCPLEGESWQDFLRNNAKSFRCALLSHRDGAKVHLGTRPTEKQYE 
TLENQLAFLCQQGFSLENALYALSAVGHFTLGCVLEDQEHQVAKEERETPTTDSM 
PPLLRQAIELFDHQGAEPAFLFGLELIICGLEKQLKCESGSAYSRARTKNNYGSTIE 
GLLDLPDDDAPEEAGLAAPRLSFLPAGHTRRLSTAPPTDVSLGDELHLDGEDVAM
AHADALDDFDLDMLGDGDSPGPGFTPHDSAPYGALDMADFEFEQMFTDALGIDE
YGG; (a-2) tTA2 (TetR-FFF) (Baron et al., 1997):
MSRLDKSKVINSALELLNEVGIEGLTTRKLAQKLGVEQPTLYWHVKNKRALLDALAI
EMLDRHHTHFCPLEGESWQDFLRNNAKSFRCALLSHRDGAKVHLGTRPTEKQYE 
TLENQLAFLCQQGFSLENALYALSAVGHFTLGCVLEDQEHQVAKEERETPTTDSM 
PPLLRQAIELFDHQGAEPAFLFGLELIICGLEKQLKCESGGPADALDDFDLDMLPAD
ALDDFDLDMLPADALDDFDLDMLPG; (a-3) tTA-p65 (TetR-p65) (Urlinger et al., 2000):
MSRLDKSKVINSALELLNEVGIEGLTTRKLAQKLGVEQPTLYWHVKNKRALLDALAI 
EMLDRHHTHFCPLEGESWQDFLRNKAKSFRCALLSHRDGAKVHLGTRPTEKQYE 
TLENQLAFLCQQGFSLENALYALSAVGHFTLGCVLEDQEHQVAKEERETPTTDSM
PPLLRQAIELFDHQGAEPAFLFGLELIICGLEKQLKCESGSSEFQYLPDTDDRHRIE
EKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSL
STINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVP
VLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDL
ASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAP
GLPNGLLSGDEDFSSIADMDFSALLSQISS; (b) Lac repressor: GenBank
accession number: J01636:
MKPVTLYDVAEYAGVSYQTVSRWNQASHVSAKTREKVEAAMAELNYlPNRVAQQ

LAGKQSLLIGVATSSLALHAPSQIVAAIKSRADQLGASNAA/SMVERSGVEACKAAV 

HNLLAQRVSGLilNYPLDDQDAIAVEAACTNVPALFLbvSDQTPiNSIIFSHEDGTRL 

GVEHLVALGHQQIALLAGPLSSVSARLRl^GWHKYLTRNQIQPIAEREGDyVSAMS 

GFQQTMQMLNEGIVPTAMLVANDQMALGAMRAITESGLRVGADISWGYDDTEDS 

SCYIPPSTTIKQDFRLLGQTSVDRLLQLSQGQAVKGNQLLPVSLVKRKTTLAPNTQ 

TASPRALADSLMQLARQVSRLESGQ; (c) Xylose repressor: GenBank accession
number: NC000964:

MTGLNKSTySSQVNTLMKESMVFEIGQGQSSGGRRPVIVlLVFNKKAGYSVGIDVG 

VDYINGILTDLEGTIVLDQYRHLESNSPEITKDILIDMIHHFITQMPQSPYGFIGIGICV 

PGLIDKDQKIVFTPNSNWRDIDLKSSIQEKYNVSVFIENEANAGAYGEKLFGAAKNH 

DNIIYVSISTGIGIGVIINNHLYRGVSGFSGEMGHMTIDFNGPKGSCGNRGCWELYA 

sekallkslqtkekklsyqdiinlahlndigtlnalqnfgfylgigltniLntfnpqav 

ILRNSIIESHPMVLNSMRSEVSSRVYSQLGNSYELLPSSLGQNAPALGMSSIVIDHF 
LDMITM; 

(d) AraC protein: GenBank accession number: J01641 

MAEAQNDPLLPGYSFNAHLVAGLTPIEANGYLDFFIDRPLGMKGYILNLTIRGQGW

KNQGREFVCRPGDILLFPPGEIHHYGRHPEAREWYHQWVYFRPRAYWHEWLNW 

PSIFANTGFFRPDEAHQPHFSDLFGQIINAGQGEGRYSELLAINLLEQLLLRRMEAI 

NESLHPPMDNRVREACQYISDHLADSNFDIASVAQHVCLSPSRLSHLFRQQLGISV 

LSWREDQRISQAKLLLSTTRMPIATVGRNVGFDDQLYFSRVFKKCTGASPSEFRA 
GCEEKVNDVAVKLS; (e) TetR-based T-Rex system: the TetR sequence is 
identical to that in (a); 

(f) repressor protein MphR(A): GenBank accession number: AB038042: 
MPRPKLKSDDEVLEAATWLKRCGPIEFTLSGVAKEVGLSRAALIQRFTNRDTLLVR
MMERGVEQVRHYLNAIPIGAGPQGLWEFLQVLVRSMNTRNDFSVNYLISWYELQV
PELRTLAIQRNRAWEGIRKRLPPGAPAAAELLLHSVIAGATMQWAVDPDGELADH
VLAQIAAILCLMFPEHDDFQLLQAHA; (g) Streptomyces coelicolor
transcriptional regulator Pip: GenBank accession number AF193856:
MMSRGEVRMAKAGREGPRDSVWLSGEGRRGGRRGRQPSGLDRDRITGVTVRLL 
DTEGLTGFSMHRLAAELNVTAMSVYWYVDTKDQLLELALDAVFGELRHPDPDAGL 
DWREELRALARENRALLVRHPWSSRLVGTYLNIGPHSLAFSRAVQNWRRSGLPA 
HRLTGAISAVFQFVYGYGTIEGRFLARVADTGLSPEEYFQDSMTAVTEVPDTAGVI
EDAQDIMAARGGDTVAEMLDRDFEFALDLLVAGIDAMVEQA; (h) Streptomyces
coelicolor transcriptional regulator ScbR: GenBank accession number
AJ007731:
MAKQDRAIRTRQTILDAAAQVFEKQGYQAATITEILKVAGVTKGALYFHFQSKEELA 
LGVFDAQEPPQAVPEQPLRLQELIDMGMLFCHRLRTNWARAGVRLSMDQQAHG 
LDRRGPFRRWHETLLKLLNQAKENGELLPHWTTDSADLYVGTFAGIQWSQTVS
DYQDLEHRYALLQKHILPAIAVPSVLAALDLSEERGARLAAELAPTGKD; (i) TraR-p65
fusion (Neddermann et al., 2003):
mefqylpdtddrhrieekrkrtyetfksimkkspfsgptdprppprriavpsrssa
svpkpapqpypftsslstinydefptmvfpsgqisqasalapappqvlpqapapap
apamvsalaqapapvpvlapgppqavappapkptqagegtlseallqlqfddedl
gallgnstdpavftdlasvdnsefqqllnqgipvaphttepmlmeypeaitrlvtg
aqrppdpapaplgapglpngllsgdedfssiadmdfsallsqissgsargvpkkk
rkvgiqegisaasrsmqhwldkltdlaaiegdecilktgladiadhfgftgyaylhi
qhrhitavtnyhrqwqstyfdkkfealdpwkrarsrkhiftwsgeherptlskd
erafydhasdfgirsgitipiktangfmsmftmasdkpvidldreidavaaaatigqi
harisflrttptaedaacvdpkeatylrwiavgktmeeiadvegvkynsvrvklre
RMKRFDVRSKAHLTALAIRRKLI; (j-i) The skilled person knows that almost any
fusion of the Phy sequences to a Gal4-DBD (aa's 1-63, 1-95, 1-141) would be
active; (j-ii) Gal4-hpr-p65 from pSwitch (Invitrogen: Geneswitch_man.pdf):
MDSQQPDLKLLSSIEQACDICRLKKLKCSKEKPKCAKCLKNNWECRYSPKTKRSPL 
TRAHLTEVESRLERLEQLFLLlFPREDLDMILkMDSLQDIKALLEFPGVDQKKFNKV 
RWRALDAVALPQPVGVPNESQALSQRFJFSPGQDIQLlPPLINLLMSIEPDVIYAG
HDNTKPDTSSSLLTSLNQLGERQLLSWKWSKSLPGFRNLHIDDQITLIQYSWMSL
MVFGLGWRSYKHVSGQMLYFAPDLILNEQRMKESSFYSLCLTMWQIPQEFVKLQV 

SQEEFLCMKVLLLLNTIPLEGLRSQTQFEEMRSSYIRELIKAIGLRQKGWSSSQRF 

YQLTKLLDNLHDLVKQLHLYCLNTFIQSRALSVEFPEMMSEVIAGSTPMEFQYLPDT 

DDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQP 

YPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALA 

QAPAPVPVLAPGPPOAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTD 

PAVFTDLASVDNSEFQQLLNQG1PVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAP 

APLGAPGLPNGLLSGDEDFSSIADMDFSALLSQISS; (j-iii-1) ZFHD1-FKBP fusion
(www.ariad.com/regulationkits):

MDYPAAKRVKLDSRERPYACPVESCDRRFSRSDELTRHIRIHTGQKPFQGRICMR 

NFSRSDHLTTHIRTHTGGGRRRKKRTSIETNIRVALEKSFLENQKPTSEEin^MIADQL 

NMEKEVIRVWFCNRRQKEKRINTRGVQVETISPGDGRTFPKRGQTCWHYTGMLE 

DGKKFDSSRDRNKPFKFMLGKQEVIRGWEEGVAQMSVGQRAKLTISPDYAYGAT 

GHPGIIPPHATLVFDVELLKLEVEGVQVETISPGDGRTFPKRGQTCWHYTGMLED 

gkkfdssrdrnkpfkfmlgkqevirgweegvaqmsvgqrakLtispdyaygatg 

HPGIIPPHATLVFDVELLKLETRGVQVETISPGDGRTFPKRGQTCWHYTGMLEDG 

KKFDSSRDRNKPFKFMLGKQEVIRGWEEGVAQMSVGQRAKLTISPDYAYGATGH 
PGIIPPHATLVFDVELLKLETSY; (j-iii-2) FRB-p65 fusion
(www.ariad.com/regulationkits):

MDYPAAKRVKLDSRILWHEMWHEGLEEASRLYFGERNVKGMFEVLEPLHAMMER 

GPQTLKETSFNQAYGRDLMEAQEWCRKYMKSGNVKDLLQAWDLYYHVFRRISKT 

RDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLA

PGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASV 

DNSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPN
GLLSGDEDFSSIADMDFSALLSQISSTSY; (j-iv-1) VgEcR from pVgRXR
(http://www.invitrogen.com/content/sfs/vectors/pvgrxr.pdf):

MAPPTDVSLGDELHLDGEDVAMAHADALDDFDLDMLGDGDSPGPGFTPHDSAPY 

GALDMADFEFEQMFTDALGIDEYGGKLLGTSRRISNSISSGRDDLSPSSSLNGYSA 

NESCDAKKSKKGPAPRVQEELCLVCGDRASGYHYNALTCGSCKVFFRRSVTKSA 

WCCKFGF^CEMDMYMRRKCQECRLKKCLAVGMRPECWPENQCAMKRREEKA 

QKEKDKMTTSPSSQHGGNGSLASGGGQDFVKKEILDLMXCEPPQHATIPLLPDEIL 

AKCQARNIPSLTYNQLAVIYKLIWYQDGYEQPSEEDLRRIMSQPDENESQTDVSFR
HITEITILTVQUVEFAKGLPAFTKIPQEDQITLLKACSSEVMMLRMARRYDHSSDSIF

FANNRSYTRDSYKMAGMADNIEDLLHFCRQMFSMKVDNVEYALLTAIVIFSDRPGL 

EKAQLVEAIQSYYIDTLRIYILNRHCGDSMSLVFYAKLLSILTELRTLGNQNAEMCFS

LKLKNRKLPKFLEEIWDVHAIPPSVQSHLQITQEENERLERAERMRASVGGAITAGI 

DCDSASTSAAAAAAQHQPQPQPQPQPSSLTQNDSQHQTQPQLQPQLPPQLQGQ 

LQPQLQPQLQTQLQPQIQPQPQLLPVSAPVPASVTAPGSLSAVSTSSEYMGGSAA 

IGPITPATTSSITAAVTASSTTSAVPMGNGVGVGVGVGGNVSMYANAQTAMALMG 

VALHSHQEQLIGGVAVKSEHSTTA; (j-iv-2) RXR from pVgRXR
(http://www.invitrogen.com/content/sfs/vectors/pvgrxr.pdf):

MDTKHFLPLDFSTQVNSSLTSPTGRGSMAAPSLHPSLGPGIGSPGQLHSPISTLSS 

PINGMGPPFSVISSPMGPHSMSVPTTPTLGFSTGSPQLSSPMNPVSSSEDIKPPLG

LNGVLKVPAHPSGNMASFTKHICAICGDRSSGKHYGVYSCEGCKGFFKRTVRKDL 

TYTCRDNKDCLIDKRQRNRCQYCRYQMCLAMGMKREAVQEERQRGKDRNENEV 

ESTSSANEDVPVERILEAELAVEPKTETYVEANVGLNPSSPNDPVTNICQAADKQL 

FTLVEWAKRIPHFSELPLDDQVILLRAGWNELLIASFSHRSIAVKDGILLATGLHVHR 

NSAHSAGVGAIFDRVLTELVSKMRDMQMDKTELGCLRAIVLFNPDSKGLSNRAEV 

EALREKWASLEAYCKHKYPEQPGRFAKLLLRLPALRSIGLKCLEHLFFFKLIGDTPI 

DTFLMEMLEAPMQMT. 

Particularly preferred are active fragments of said (poly)peptides. 

In yet another preferred embodiment of the present invention, the reporter protein of
the readout system is β-galactosidase; CAT; β-glucuronidase; β-xylosidase; XylE
(catechol dioxygenase); TreA (trehalase); GFP and variants CFP, YFP, EGFP,
GFP+ (Scholz et al., 2000) of GFP; bacterial luciferase (luxAB); Photinus luciferase;
Renilla luciferase; coral-derived photoproteins including DsRed, HcRed, AmCyan,
ZsGreen, ZsYellow, AsRed; alkaline phosphatase or secreted alkaline phosphatase.

In a preferred embodiment of the present invention's method, the reporter protein is 
a protein that confers resistance to an antibiotic. In this case, it is preferred that the 
cells be cultivated in the presence of an antibiotic so that only clones expressing the 
reporter protein are capable of propagating. Generally, the protein can mediate 
resistance to an antibiotic such as Ampicillin, Kanamycin, Chloramphenicol,
Tetracycline, Hygromycin, Neomycin or Methotrexate. Further examples of
antibiotics are Penicillins: Ampicillin HCl, Ampicillin Na, Amoxycillin Na, Carbenicillin
disodium, Penicillin G, Cephalosporins, Cefotaxim Na, Cefalexin HCl, Vancomycin,
Cycloserine. Other examples include bacteriostatic inhibitors such as:
Chloramphenicol, Erythromycin, Lincomycin, Tetracycline, Spectinomycin sulfate,
Clindamycin HCl, Chlortetracycline HCl. Additional examples are proteins that allow
selection with bactericidal inhibitors such as those affecting protein synthesis
irreversibly, causing cell death. Aminoglycosides can be inactivated by enzymes
such as NPT II, which phosphorylates the 3'-OH present on kanamycin, thus inactivating
this antibiotic. Some aminoglycoside-modifying enzymes acetylate the compounds
and block their entry into the cell. Examples of such aminoglycosides are Gentamycin,
Hygromycin B, Kanamycin, Neomycin, Streptomycin, G418 and Tobramycin. Proteins
that allow selection with
nucleic acid metabolism inhibitors like Rifampicin, Mitomycin C, Nalidixic acid,
Doxorubicin HCl, 5-Fluorouracil, 6-Mercaptopurine, Antimetabolites, Miconazole,
Trimethoprim, Methotrexate, Metronidazole, Sulfamethoxazole are also examples of
reporter proteins.

In a preferred embodiment of the present invention the cell or host cell is selected 
from a mammalian, insect, nematode, plant, yeast, protist cell, non-pathogenic 
Gram-positive or Gram-negative bacteria, non-pathogenic archaebacteria or non- 
pathogenic protozoa. In another preferred embodiment of the present invention the 
cell is a pathogenic organism selected from the group consisting of yeast, gram- 
positive bacteria, gram-negative bacteria, archaebacteria or protozoa. 

In yet another preferred embodiment of the present invention, all or a subset of the
proteins encoded by the cell are tagged. Tagged proteins may be expressed as
stable constructs, i.e. integrated into the genome of the cell, or transiently from a
vector. Preferably, the tagged protein is expressed from its natural position in the
genome of the respective organism.

The present invention also relates to a method of producing and/or selecting a 
compound capable of modulating a nucleic acid binding protein comprising the 
steps of: (a) conducting a selection of compounds with the nucleic acid binding 
target protein under conditions allowing an interaction of the compound and the
nucleic acid binding protein; (b) removing unspecifically bound compounds; (c)
detecting specific binding of compounds to the nucleic acid binding target protein;
(d) expressing in a cell the nucleic acid binding protein and providing in trans the
coding sequence of at least one indicator gene, wherein said coding sequence is
under control of the target sequence of the nucleic acid binding protein; (e) adding a
candidate compound to the cell of step (d); (f) determining the amount or activity of
the indicator protein, wherein a reduced or increased amount of indicator protein is
indicative of compounds capable of modulating the activity of the nucleic acid
binding protein; and (g) selecting compounds capable of modulating the activity of
the nucleic acid binding protein. The compound identified by this method is an
interaction partner of said biomolecule. Preferably, said compounds are
(poly)peptides or derivatives thereof.
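
Steps (f) and (g) amount to comparing the indicator readout with and without the candidate compound and keeping the compounds that change it. The following is a minimal sketch of that decision only. The names `Readout`, `classify` and `select_modulators`, the two-fold threshold and all numerical values are illustrative assumptions and are not taken from the description.

```python
from dataclasses import dataclass


@dataclass
class Readout:
    """Indicator measurement for one candidate compound (arbitrary units)."""
    compound: str
    indicator_without: float  # indicator amount or activity without compound
    indicator_with: float     # indicator amount or activity with compound


def classify(readout: Readout, fold_threshold: float = 2.0) -> str:
    """Step (f): report whether the indicator is increased, reduced or unchanged.

    The two-fold default threshold is a hypothetical choice for illustration.
    """
    if readout.indicator_without <= 0 or readout.indicator_with <= 0:
        raise ValueError("indicator values must be positive")
    ratio = readout.indicator_with / readout.indicator_without
    if ratio >= fold_threshold:
        return "increased"
    if ratio <= 1.0 / fold_threshold:
        return "reduced"
    return "unchanged"


def select_modulators(readouts, fold_threshold: float = 2.0):
    """Step (g): keep compounds whose indicator readout changed in either direction."""
    return [r.compound for r in readouts
            if classify(r, fold_threshold) != "unchanged"]


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    data = [
        Readout("candidate-1", 100.0, 450.0),
        Readout("candidate-2", 100.0, 30.0),
        Readout("candidate-3", 100.0, 120.0),
    ]
    for r in data:
        print(r.compound, classify(r))
    print(select_modulators(data))
```

Both directions of change are retained because the text treats a reduced as well as an increased amount of indicator protein as indicative of a modulating compound.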

In a preferred embodiment of the present invention, particularly in cases when the
biomolecule is the Tet repressor, said compound is any of the (poly)peptides
encoded by the nucleic acid molecules of the present invention or a derivative
thereof. Derivatives may be (poly)peptides with substitutions, deletions or additions
in order to improve the (poly)peptide's affinity and/or specificity to the biomolecule.
In addition, said compounds may contain chemical modifications in order to improve
the solubility and/or cellular uptake of the compound. Such modifications include,
but are not limited to (i) esterification of carboxyl groups, or (ii) esterification of
hydroxyl groups with carboxylic acids, or (iii) esterification of hydroxyl groups to, e.g.
phosphates, pyrophosphates or sulfates or hemisuccinates, or (iv) formation of
pharmaceutically acceptable salts, or (v) formation of pharmaceutically acceptable
complexes, or (vi) synthesis of pharmacologically active polymers, or (vii)
introduction of hydrophilic moieties, or (viii) introduction/exchange of substituents on
aromatics or side chains, change of substituent pattern, or (ix) modification by
introduction of isosteric or bioisosteric moieties, or (x) synthesis of homologous
compounds, or (xi) introduction of branched side chains, or (xii) conversion of alkyl
substituents to cyclic analogues, or (xiii) derivatisation of hydroxyl groups to ketals,
acetals, or (xiv) N-acetylation to amides, phenylcarbamates, or (xv) synthesis of
Mannich bases, imines, or transformation of ketones or aldehydes to Schiff's bases,
oximes, acetals, ketals, enolesters, oxazolidines, thiazolidines or combinations
thereof.

The present invention also relates to a nucleic acid molecule encoding a
(poly)peptide comprising the sequence (a) Met - Trp - Thr - Trp - Asn - Ala - Tyr -
Ala - Phe - Ala - Ala - Pro - Ser - Gly - Gly - Gly - Ser (SEQ ID NO: 1); (b) Trp -
Thr - Trp - Asn - Ala - Tyr - Ala - Phe - Ala - Ala - Pro - Ser - Gly - Gly - Gly -
Ser (SEQ ID NO: 2); (c) Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe - Ala - Ala -
Pro - Ser (SEQ ID NO: 3); (d) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala -
Tyr - Ala - Phe - Ala - Ala - Pro - Ser - Gly - Gly - Gly - Ser (SEQ ID NO: 4); (e)
Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe - Ala - Ala -
Ala - Ser - Gly - Gly - Gly - Ser (SEQ ID NO: 5); (f) Ser - Gly - Gly - Ala - Trp -
Thr - Trp - Asn - Ala - Tyr - Ala - Phe - Ala - Ala - Pro - Ala - Gly - Gly - Gly -
Ser (SEQ ID NO: 6); (g) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr -
Ala - Phe - Ala - Ala - Pro - Ser - Gly - Arg - Gly - Ser (SEQ ID NO: 7); (h) Ser -
Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe - Ala - Ala - Pro -
Ser - Asp - Gly - Gly - Leu (SEQ ID NO: 8); (i) Ser - Gly - Gly - Ala - Trp - Thr -
Trp - Asn - Ala - Tyr - Ala - Phe - Ala - Ala - Pro - Ser - Gly - Glu - Gly - Ser
(SEQ ID NO: 9); (j) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala -
Phe - Ala - Ala - Pro - Ser - Gly - Gly - Gly - Trp (SEQ ID NO: 10); (k) Ser - Gly
- Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe - Ala - Ala - Pro - Ser -
Gly - Gly - Cys - Ser (SEQ ID NO: 11); (l) Ser - Gly - Gly - Ala - Trp - Thr - Trp -
Asn - Ala - Tyr - Ala - Phe - Ala - Ala - Pro - Ser - Gly - Gly - Asp - Ser (SEQ
ID NO: 12); (m) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala -
Phe - Ala - Ala - Pro - Ser - Gly - Gly - Arg - Ser (SEQ ID NO: 13); (n) Ser - Gly
- Gly - Ala - Trp - Thr - Trp - Asn - Ala - Phe - Ala - Phe - Ala - Ala - Pro - Ser
- Gly - Gly - Gly - Ser (SEQ ID NO: 14); or a nucleic acid molecule, the
complementary strand of which hybridizes under stringent conditions to the nucleic
acid molecule of any one of (a) to (n), wherein said nucleic acid molecule encodes
an interaction partner which is capable of modulating the activity of a nucleic acid
binding protein.
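
The three-letter sequences listed above are easier to compare in one-letter code. The sketch below converts a few of them using the standard amino acid abbreviations and reports where two equally long peptides differ. The helper names `to_one_letter` and `differences` are illustrative assumptions, and the one-letter strings are a transcription of the listed sequences, not an additional disclosure.

```python
# Standard three-letter to one-letter amino acid abbreviations.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter: str) -> str:
    """Convert 'Met - Trp - Thr' style notation to 'MWT'."""
    return "".join(THREE_TO_ONE[res.strip()] for res in three_letter.split("-"))


def differences(a: str, b: str):
    """Return (1-based position, residue in a, residue in b) for each mismatch."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Sequences transcribed from the listing in the text.
SEQ_ID_1 = ("Met - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe - Ala - Ala - "
            "Pro - Ser - Gly - Gly - Gly - Ser")
SEQ_ID_3 = "Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe - Ala - Ala - Pro - Ser"
SEQ_ID_4 = ("Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - "
            "Phe - Ala - Ala - Pro - Ser - Gly - Gly - Gly - Ser")
SEQ_ID_5 = ("Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - "
            "Phe - Ala - Ala - Ala - Ser - Gly - Gly - Gly - Ser")

if __name__ == "__main__":
    for name, seq in [("SEQ ID NO: 1", SEQ_ID_1), ("SEQ ID NO: 3", SEQ_ID_3),
                      ("SEQ ID NO: 4", SEQ_ID_4), ("SEQ ID NO: 5", SEQ_ID_5)]:
        print(name, to_one_letter(seq))
    print(differences(to_one_letter(SEQ_ID_4), to_one_letter(SEQ_ID_5)))
```

In this notation the twelve-residue sequence of SEQ ID NO: 3 appears as the common core of the longer variants, which differ from one another by single substitutions.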

The term "hybridizes under stringent conditions", as used in the description of the
present invention, is well known to the skilled artisan and corresponds to conditions
of high stringency. Appropriate stringent hybridization conditions for each nucleic
acid sequence may be established by a person skilled in the art on the basis of well-known
parameters such as temperature, composition of the nucleic acid molecules, salt
conditions etc.; see, for example, Sambrook et al., "Molecular Cloning, A Laboratory
Manual"; CSH Press, Cold Spring Harbor, 1989 or Higgins and Hames (eds.),
"Nucleic acid hybridization, a practical approach", IRL Press, Oxford 1985, see in
particular the chapter "Hybridization Strategy" by Britten & Davidson, 3 to 15.
Stringent hybridization conditions are, for example, conditions comprising overnight
incubation at 42°C in a solution comprising: 50% formamide, 5x SSC (750 mM
NaCl, 75 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's
solution, 10% dextran sulfate, and 20 micrograms/ml denatured, sheared salmon
sperm DNA, followed by washing the filters in 0.1x SSC at about 65°C. Other
stringent hybridization conditions are, for example, 0.2x SSC (0.03 M NaCl,
0.003 M sodium citrate, pH 7) at 65°C. In addition, to achieve even higher
stringency, washes performed following stringent hybridization can be done at
higher salt concentrations (e.g. 5X SSC). Note that variations in the above
conditions may be accomplished through the inclusion and/or substitution of
alternate blocking reagents used to suppress background in hybridization
experiments. Typical blocking reagents include, but are not limited to, Denhardt's
reagent, BLOTTO, heparin, denatured salmon sperm DNA, and commercially
available proprietary formulations. The inclusion of specific blocking reagents may
require modification of the hybridization conditions described above, due to
problems with compatibility. Also contemplated are nucleic acid molecules encoding
an interaction partner of a biomolecule, wherein the interaction partner is capable of
modulating the activity of said biomolecule and wherein the nucleic acid molecules
hybridize to the nucleic acid molecule encoding the biomolecule at even lower
stringency hybridization conditions. Changes in the stringency of hybridization and
signal detection are, for example, accomplished through the manipulation of 
formamide concentration (lower percentages of formamide result in lowered
stringency), salt conditions, or temperature. For example, lower stringency
conditions include an overnight incubation at 37°C in a solution comprising
6X SSPE (20X SSPE = 3 M NaCl; 0.2 M NaH2PO4; 0.02 M EDTA, pH 7.4), 0.5%
SDS, 30% formamide, 100 µg/ml salmon sperm blocking DNA; followed by washes
at 50°C with 1X SSPE, 0.1% SDS. In addition, to achieve even lower
stringency, washes performed following stringent hybridization can be done at
higher salt concentrations (e.g. 5X SSC). Note that variations in the above
conditions may be accomplished through the inclusion and/or substitution of
alternate blocking reagents used to suppress background in hybridization
experiments. Typical blocking reagents include, but are not limited to, Denhardt's
reagent, BLOTTO, heparin, denatured salmon sperm DNA, and commercially
available proprietary formulations. The inclusion of specific blocking reagents may
require modification of the hybridization conditions described above, due to
problems with compatibility.
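
The buffer strengths quoted in the hybridization conditions are related by simple dilution arithmetic. The sketch below derives the 1x compositions from the values stated in the text (5x SSC = 750 mM NaCl and 75 mM trisodium citrate; 20x SSPE = 3 M NaCl, 0.2 M NaH2PO4 and 0.02 M EDTA) and scales them to other strengths. The function names `buffer_composition` and `stock_volume_ml` and the 100 ml example volume are illustrative assumptions.

```python
# 1x compositions in mM, back-calculated from the strengths given in the text.
SSC_1X_MM = {"NaCl": 750.0 / 5, "trisodium citrate": 75.0 / 5}
SSPE_1X_MM = {"NaCl": 3000.0 / 20, "NaH2PO4": 200.0 / 20, "EDTA": 20.0 / 20}


def buffer_composition(one_x_mm: dict, fold: float) -> dict:
    """Component concentrations (mM) of a buffer at the given strength, e.g. 0.2 for 0.2x."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {component: conc * fold for component, conc in one_x_mm.items()}


def stock_volume_ml(stock_fold: float, target_fold: float, final_volume_ml: float) -> float:
    """Volume of stock needed for a dilution, from C1 * V1 = C2 * V2."""
    if target_fold > stock_fold:
        raise ValueError("target strength cannot exceed stock strength")
    return target_fold * final_volume_ml / stock_fold


if __name__ == "__main__":
    for fold in (5, 0.2, 0.1):
        comp = buffer_composition(SSC_1X_MM, fold)
        print(f"{fold}x SSC:", {k: round(v, 3) for k, v in comp.items()})
    print("6x SSPE:", buffer_composition(SSPE_1X_MM, 6))
    # Hypothetical example: preparing 100 ml of 6x SSPE from a 20x stock.
    print("20x SSPE stock needed (ml):", stock_volume_ml(20, 6, 100))
```

With these values 0.2x SSC works out to 30 mM NaCl and 3 mM citrate, which agrees with the 0.03 M and 0.003 M figures given in the text.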

Preferably, said nucleic acid molecule hybridizing to the nucleic acid molecule
encoding the interaction partner of a biomolecule has a sequence identity of at least
60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 99.5% when
compared to the nucleic acid molecule encoding the interaction partner. The term
"which is at least 80% identical", as used in the present invention, relates to
sequence identity as determined by the Bestfit® program (Wisconsin Sequence
Analysis Package, Version 8 for Unix, Genetics Computer Group, University
Research Park, 575 Science Drive, Madison, WI 53711). Bestfit® uses the local
homology algorithm of Smith and Waterman to find the best segment of homology
between two sequences (Advances in Applied Mathematics 2:482-489 (1981)).
When using Bestfit® or any other sequence alignment program to determine
whether a particular sequence is, for instance, 80% identical to a reference
sequence, the parameters are set, of course, such that the percentage of identity is
calculated over the full length of the reference nucleotide or amino acid sequence
and that gaps in homology of up to 5% of the total number of nucleotides or amino
acids in the reference sequence are allowed. The identity between a first sequence
and a second sequence, also referred to as a global sequence alignment, may also
be determined by using the FASTDB computer program based on the algorithm of
Brutlag and colleagues (Comp. App. Biosci. 6:237-245 (1990)). In a sequence
alignment the query and subject sequences are both DNA sequences. An RNA
sequence can be compared by converting U's to T's. The result of said global
sequence alignment is in percent identity. Preferred parameters used in a FASTDB
alignment of DNA sequences to calculate percent identity are: Matrix=Unitary,
k-tuple=4, Mismatch Penalty=1, Joining Penalty=30, Randomization Group
Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, Window
Size=500 or the length of the subject nucleotide sequence, whichever is shorter.
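
To illustrate what "percent identity calculated over the full length of the reference sequence" means in practice, the sketch below counts identical positions in an already aligned pair and divides by the ungapped reference length. It is not an implementation of Bestfit or FASTDB and performs no alignment itself. The function names and the short nucleotide example are illustrative assumptions; the peptide example uses SEQ ID NO: 4 and SEQ ID NO: 5 transcribed into one-letter code.

```python
def rna_to_dna(seq: str) -> str:
    """Convert U's to T's so an RNA sequence can be compared with DNA, as the text describes."""
    return seq.upper().replace("U", "T")


def percent_identity(reference_aligned: str, query_aligned: str, gap: str = "-") -> float:
    """Percent identity over the full (ungapped) length of the reference.

    Both inputs must come from the same pairwise alignment and therefore have
    equal length; gap characters mark insertions and deletions.
    """
    if len(reference_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    reference_length = sum(1 for c in reference_aligned if c != gap)
    if reference_length == 0:
        raise ValueError("reference contains no residues")
    matches = sum(1 for r, q in zip(reference_aligned, query_aligned)
                  if r != gap and r == q)
    return 100.0 * matches / reference_length


def meets_threshold(identity_percent: float, threshold_percent: float) -> bool:
    """Check an identity value against one of the thresholds listed in the text."""
    return identity_percent >= threshold_percent


if __name__ == "__main__":
    seq4 = "SGGAWTWNAYAFAAPSGGGS"  # SEQ ID NO: 4 in one-letter code
    seq5 = "SGGAWTWNAYAFAAASGGGS"  # SEQ ID NO: 5 in one-letter code
    identity = percent_identity(seq4, seq5)
    print(identity, meets_threshold(identity, 95.0))
    # Hypothetical nucleotide alignment with one gap in the reference.
    print(percent_identity("ACGT-ACGT", rna_to_dna("ACGUUACGA")))
```

Dividing by the reference length rather than by the alignment length is one reading of the requirement stated in the text; alignment programs may report identity differently.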

The present invention also relates to a (poly)peptide encoded by the nucleic acid of 
the present invention. Preferably, the interaction partner of the present invention
contains one copy of the peptide sequence encoded by the nucleic acid molecule of
the present invention. However, also encompassed by the present invention are
(poly)peptides which contain two, three, four, five or more copies of the peptide
sequence encoded by the nucleic acid molecules of the present invention or of any
interaction partner as defined in the present invention. Also encompassed by the
present invention are (poly)peptides with at least 60%, 65%, 70%, 75%, 80%, 85%,
90%, 95%, 96%, 97%, 98%, 99% or 99.5% amino acid sequence identity to the
(poly)peptides encoded by the nucleic acid molecules of the present invention,
wherein these (poly)peptides are capable of interacting with a biomolecule and of
modulating its function.

The present invention also relates to an expression vector, comprising an
expression control sequence, a multiple cloning site for inserting a gene of interest
and the nucleic acid molecule of the present invention, wherein the gene of interest
is inserted in-frame with the ORF encoding the peptide. The term "ORF" means
"open reading frame" and relates to a nucleotide sequence encoding and capable of
expressing a (poly)peptide.

The present invention also relates to a vector comprising the nucleic acid molecule
of the present invention. Preferably said vector is a transfer or expression vector
selected from the group consisting of pACT2; pAS2-1; pBTM116; pBTM117;
pcDNA3.1; pcDNAI; pECFP; pECFP-C1; pECFP-N1; pECFP-N2; pECFP-N3;
pEYFP-C1; pFLAG-CMV-5 a, b, c; pGAD10; pGAD424; pGAD425; pGAD427;
pGAD428; pGBT9; pGEX-3X1; pGEX-5X1; pGEX-6P1; pGFP; pQE30; pQE30N;
pQE30-NST; pQE31; pQE31N; pQE32; pQE32N; pQE60; pSE111; pSG5; pTK-Hyg;
pTL1; pTL10; pTL-HA0; pTL-HA1; pTL-HA2; pTL-HA3; pBTM118c; pGEX-6P3;
pACGHLT-C; pACGHLT-A; pACGHLT-B; pUP; pcDNA3.1-V5His; pMalc2x. Said
expression vectors may particularly be BAC (bacterial artificial chromosome) and
YAC (yeast artificial chromosome) plasmids, cosmids, viruses and bacteriophages
used conventionally in genetic engineering that comprise the aforementioned
nucleic acid. Preferably, said vector is a gene transfer or targeting vector.

Viruses such as retroviruses, vaccinia virus,
adenovirus, adeno-associated virus, herpes viruses, or bovine papilloma virus may
be used for delivery of the nucleic acid into a targeted cell population. Methods which
are well known to those skilled in the art can be used to construct recombinant viral
vectors; see, for example, the techniques described in Sambrook et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory (1989) N.Y. and
Ausubel et al., Current Protocols in Molecular Biology, Green Publishing Associates
and Wiley Interscience, N.Y. (1989).

In yet a further preferred embodiment of the invention, the vector contains an
additional expression cassette for a reporter protein, selected from the group
consisting of β-galactosidase, luciferase, green fluorescent protein and variants
thereof.

Preferably, said vector comprises regulatory elements for expression of said nucleic 
acid molecule. Accordingly, the nucleic acid of the invention may be operatively 
linked to expression control sequences allowing expression in eukaryotic cells. 
Expression of said nucleic acid molecule comprises transcription of the sequence of 
the nucleic acid molecule into a translatable mRNA. Regulatory elements ensuring 
expression in eukaryotic cells, preferably mammalian cells, are well known to those
skilled in the art. They usually comprise regulatory sequences ensuring initiation of 
transcription and, optionally, a poly-A signal ensuring termination of transcription 
and stabilization of the transcript, and/or an intron further enhancing expression of 
said nucleic acid. Additional regulatory elements may include transcriptional as well 
as translational enhancers, and/or naturally-associated or heterologous promoter 
regions. Possible regulatory elements permitting expression in eukaryotic host cells 
are the AOX1 or GAL1 promoter in yeast or the CMV-, SV40-, RSV-promoter (Rous
sarcoma virus), CYC1 promoter, CMV-enhancer, SV40-enhancer or a globin intron
in mammalian and other animal cells. Besides elements which are responsible for the
initiation of transcription, such regulatory elements may also comprise transcription
termination signals, such as the SV40-poly-A site or the tk-poly-A site, downstream
of the nucleic acid molecule. Furthermore, depending on the expression system
used, leader sequences capable of directing the (poly)peptide to a cellular
compartment or secreting it into the medium may be added to the coding sequence
of the aforementioned nucleic acid and are well known in the art. The leader
sequence(s) is (are) assembled in appropriate phase with translation, initiation and
termination sequences, and preferably, a leader sequence capable of directing
secretion of translated protein, or a portion thereof, into the periplasmic space or
extracellular medium. Optionally, the heterologous sequence can encode a fusion
protein including a C- or N-terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant
product. In this context, suitable expression vectors are known in the art such as
Okayama-Berg cDNA expression vector pcDV1 (Pharmacia), pCDM8, pRc/CMV,
pcDNAI, pcDNA3, the Echo™ Cloning System (Invitrogen), pSPORT1 (GIBCO
BRL) or pCI (Promega).

The present invention also relates to a host cell containing the nucleic acid molecule
of the present invention or the expression vector of the present invention.
Preferably, the host cell is a eukaryotic or prokaryotic cell. The cell can be the cell of
a uni-cellular or multi-cellular organism, which may be pathogenic or
non-pathogenic. Preferably, the host cell is selected from mammalian, insect, nematode,
plant, yeast, protist cell, gram-positive or gram-negative bacteria, archaebacteria or
protozoa.

In a preferred embodiment, the nucleic acid molecule of the present invention is
fused in frame to at least one chromosomal sequence encoding a (poly)peptide. In
another preferred embodiment, the host cell of the present invention also contains
(a) the coding sequence of a reporter protein which is under control of a nucleic acid
binding protein; and (b) the coding sequence of the nucleic acid binding protein of
(a), wherein said coding sequences are operatively linked to expression control
sequences. As a consequence of the fusion of the nucleic acid molecule of the
present invention to the chromosomal sequence encoding a (poly)peptide, said
(poly)peptide is capable of binding to the nucleic acid binding protein which will
result in a modulation of gene expression of the reporter protein controlled by said
nucleic acid binding protein. In cases when the nucleic acid binding protein is a
repressor, the negative or repressing effect will be suspended so that gene
expression can start or proceed. In cases when the nucleic acid binding protein is
an enhancer or activator, binding of the fusion protein to the nucleic acid binding
protein will result in activation of gene expression and in production of the reporter
gene product or protein.
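
The two cases described above can be written down as a small Boolean model. It is a deliberately simplified sketch that treats binding and expression as all-or-nothing, which real reporter systems are not. The names `Regulator` and `reporter_expressed` are illustrative assumptions.

```python
from enum import Enum


class Regulator(Enum):
    REPRESSOR = "repressor"
    ACTIVATOR = "activator"


def reporter_expressed(regulator: Regulator, fusion_bound: bool) -> bool:
    """Return True if the reporter gene is expressed in this simplified model.

    fusion_bound: whether the tagged fusion (poly)peptide is present and bound
    to the nucleic acid binding protein.
    """
    if regulator is Regulator.REPRESSOR:
        # An unbound repressor blocks transcription; binding of the fusion
        # protein suspends repression.
        repressor_active = not fusion_bound
        return not repressor_active
    if regulator is Regulator.ACTIVATOR:
        # In the text, binding of the fusion protein to the activator results
        # in activation of gene expression.
        activator_active = fusion_bound
        return activator_active
    raise ValueError(f"unknown regulator: {regulator!r}")


if __name__ == "__main__":
    for regulator in Regulator:
        for bound in (False, True):
            print(regulator.value, "fusion bound:", bound,
                  "-> reporter expressed:", reporter_expressed(regulator, bound))
```

In both cases the reporter is produced only when the tagged (poly)peptide is expressed and bound, which is why reporter output can serve as a readout for expression of the tagged gene.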

In a preferred embodiment of the present invention's host cell, the nucleic acid 
binding protein is a repressor of transcription. In another preferred embodiment of 
the present invention's host cell, the nucleic acid binding protein is an enhancer of 
transcription. Preferably said enhancer of transcription is cAMP responsive protein 
(CRP) or AraC.

In a particularly preferred embodiment of the present invention, said nucleic acid
binding protein is a (poly)peptide comprising the (poly)peptide sequence of (a) Tet
repressor; (b) lac repressor; (c) Xylose repressor; (d) AraC protein; (e) TetR-based
T-Rex system; (f) erythromycin-specific repressor MphR(A); (g) Pip (pristinamycin
interacting protein); (h) ScbR from Streptomyces coelicolor; (i) TraR of
Agrobacterium tumefaciens fused to the eukaryotic activation domain p65 of NFκB;
or (j) chimeric proteins. Preferably said chimeric proteins consist of (i) a Gal4
DNA-binding domain and either (1) a full-length PhyA protein (PhyA-GBD) or (2) the
N-terminal photosensory domain of PhyB [PhyB(NT)-GBD] of Arabidopsis thaliana; (ii)
steroid hormone regulated systems like the GeneSwitch regulatory system from
Valentis (www.geneswitch.com), sold by Invitrogen (catalog number K1060-02) and
consisting of (1) a Gal4 DNA-binding domain fused to a human progesterone
receptor ligand binding domain and an NFκB-derived p65 eukaryotic transcription
activation domain and (2) the inducer mifepristone; (iii) Dimerizer systems from
ARIAD consisting of (1) a ZFHD1 DNA-binding domain fused to FKBP12, (2) a
modified form of FRAP, termed FRB, fused to the NFκB-derived p65 activation
domain and (3) rapamycin, AP22565 or AP12967 as heterodimer-forming agents,
as they bind to both (1) and (2); or (iv) Ecdysone-Inducible Expression System (with
the Inducing Agents Ponasterone A and Muristerone A) from Invitrogen (catalog
numbers K1001-01, K1003-01, K1004-01) containing (1) a modified form of the
Drosophila ecdysone receptor (VgEcR) fused to a VP16 activation domain, (2) the
mammalian homologue RXR of ultraspiracle, the natural binding partner of the
ecdysone receptor in Drosophila and (3) the inducers ponasterone A and
muristerone A.

The present invention also relates to an ensemble of host cells of the present
invention, wherein said ensemble comprises two or more cells, each of which
contains at least one nucleic acid molecule fused in frame to an open reading frame
encoding a (poly)peptide. The term "ensemble of host cells" refers to a population of
cells, wherein said population contains cells with differently tagged host proteins.

Preferably, an individual cell only contains one open reading frame tagged with
nucleic acid sequences encoding the interaction partner of the regulatory
biomolecule. However, in some cases it may be desirable that 2, 3, 4, 5, 6, 7, 8, 9,
or more ORFs contain nucleic acid sequences encoding the interaction partner of
the regulatory biomolecule. Preferably, the ensemble or population of cells in their
sum are characterized by representing the entire proteome of the host.

In a preferred embodiment of the present invention's host cell, the nucleic acid 
binding protein is an enhancer or activator of transcription. 

In another preferred embodiment, the ensemble of host cells of the present
invention contains subpopulations with different open reading frames being fused to 
said nucleic acid molecule of the present invention. In a more preferred embodiment 
of the present invention the sum of said open reading frames forms the proteome of 
the host cell. 

The present invention also relates to a non-human animal containing the host cell of 
the present invention or the ensemble of host cells of the present invention. The 
term "non-human" relates to vertebrate and invertebrate animals. Preferred animals 
are Drosophila, mouse, rat, rabbit, monkey and cat. Hosts for pathogens can be 
inoculated intravenously with pathogens (Ji et al., 1999; Nakayama et al., 1998), by
intraperitoneal injection (van Deursen et al., 2001), subcutaneous injection (Dubey
et al., 2001), injection into the abdomen (Dionne et al., 2003), aerosol infection by
inhalation (Schwebach et al., 2002), by infection with natural hosts (van Deursen et
al., 2001), or by simple feeding (Dubey et al., 2001).

The present invention also relates to a kit comprising (a) the nucleic acid molecule,
(b) the (poly)peptide, (c) the vector and/or (d) the host cell or ensemble of host cells 
of the present invention and (e) instructions for use; in one or more containers. 

Finally, the present invention relates to the use of the (poly)peptide of the present 
invention, the nucleic acid molecule of the present invention, the expression vector 
of the present invention or the host cell or ensemble of host cells of the present 
invention for monitoring expression of a gene. 

Figures: 

Figure 1: Experimental procedure for the in vitro selection. For the phage
display (Kay et al., 2001) experiments, an M13 phage bank (Mourez et al., 2001) of
about 10^9 different dodecapeptides was used. After the first round of selection,
binding phages were eluted unspecifically by low pH. In round 2, the eluate from
round 1 was split into 3 different modes of elution, which were also used in round 3.
The output phage titer was determined after each round (shown as pfu/ml for
TetR-specific elution). The ≈ 400x increase indicates the enrichment of binding phages.
After the third round individual clones were isolated, sequenced and characterised
for TetR-binding by ELISA.
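
The enrichment mentioned in the legend is the ratio of output phage titers between selection rounds. A minimal sketch of that calculation follows. The function name `fold_enrichment` and the titer values are hypothetical and chosen only so that the ratio reproduces the roughly 400-fold figure; the actual titers are shown in the figure and are not reproduced here.

```python
def fold_enrichment(titer_earlier_pfu_per_ml: float, titer_later_pfu_per_ml: float) -> float:
    """Ratio of output phage titers (pfu/ml) between two selection rounds."""
    if titer_earlier_pfu_per_ml <= 0:
        raise ValueError("earlier titer must be positive")
    return titer_later_pfu_per_ml / titer_earlier_pfu_per_ml


if __name__ == "__main__":
    # Hypothetical output titers for two rounds of TetR-specific elution.
    earlier_round = 2.5e4  # pfu/ml
    later_round = 1.0e7    # pfu/ml
    print(f"enrichment: {fold_enrichment(earlier_round, later_round):.0f}x")
```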

Figure 2: Example for in vitro selected sequences. The first five candidates were
obtained by unspecific elution using a glycine-containing buffer at pH 2.2. The
second five, containing the peptide pepBs1, were obtained by specific elution with
4 µM TetR(B).

Figure 3: Characterisation of TetR-phage binding by ELISA. To test for TetR-phage
interactions (Kay et al., 2001), 96-well microtiter plates were coated with
TetR(B) or bovine serum albumin (BSA) as negative control. M13-phage clones
were then added in increasing concentrations. After incubation and several washing
steps, a monoclonal anti-M13 antibody coupled to horseradish peroxidase was
added. TetR-M13 interaction was detected by dye formation after adding the
substrate ABTS (2,2'-azino-bis(3-ethylbenzthiazoline-6-sulfonic acid)). The
abbreviations (s) and (us) represent specific and unspecific elution, respectively; pfu
represents "plaque forming units".

Figure 4: Design of the peptide expressing construct. The E. coli protein 
thioredoxin A (TrxA) is a small and highly soluble cytoplasmic protein which serves 
as the carrier (Park and Raines, 2000) for the in vitro selected peptides. The trxA 
gene was cloned under the expression control of Ptac (Ettner et al., 1996). The in 
vitro selected peptides were fused with a spacer to the solvent exposed C-terminus 
of TrxA. They also contained the M13 linker present in the initial selection. The 
expression of the TrxA-peptide fusion proteins is induced by addition of IPTG.

Figure 5: [...] for the in vivo screening system. The screening was carried out
in an E. coli strain containing the lacZ gene under tet control (Smith and Bertrand,
1988). pWH510lacIq expresses TetR constitutively (Altschmied et al., 1988). TetR
binds to its operator sequence and thereby prevents transcription of the reporter
gene. pWH610trxA-pepX encodes a fusion protein (pepX stands for any in vitro
selected peptide). Its expression is induced by addition of IPTG (Ettner et al., 1996).
If a peptide fused to thioredoxin binds and induces TetR, β-galactosidase is
expressed. This can be detected on MacConkey plates and quantified in LacZ
assays.

r 

Figure 6: MacConkey plate. The E. coli strain DH5α(λtet50) was transformed with
the plasmids shown in the table above and streaked out to single colonies on
MacConkey plates (+ 30 µM IPTG).

Figure 7: LacZ assay for the TetR-inducing fusion protein TrxA-pepBs1. This 
assay was carried out using the in vivo system from figure 6. The repressed state is 
symbolized with light grey bars, the tc-induced state with dark grey bars. TrxA 
without the selected peptide was used as a negative control. Upon induction with
125 µM IPTG, the in vitro selected peptide pepBs1 (as fusion to TrxA) induces
TetR(B), which was used as target during the selection. In contrast, a TetR(BD) 
chimera, in which the protein core containing the inducer-binding and dimerisation 
domains of TetR(B) (shown in light grey in the cartoon) is replaced by the sequence 
of TetR from class D (shown in dark grey in the cartoon) is not induced. The single 
chain TetR (scTetR) (Krueger et al., 2003) of class B, in which the C-terminus of 
one monomer is linked to the N-terminus of the other monomer, is also induced. 

Figure 8: Identification of the region of interaction between TetR and
TrxA-pepBs1 by in vivo epitope mapping. To delimit the region of interaction, in vivo
epitope mapping was carried out. For that purpose, we constructed a set of 17
TetR(BD) chimeras (Schnappinger et al., 1998), in which different parts of α4 to α10
were replaced by the TetR(D) sequence, and characterised them for inducibility in
vivo. We exploited the fact that TetR(B) is inducible by the peptide, but TetR(BD) is
not: a B-D exchange which leads to a loss of induction thus indicates the
importance of at least part of the replaced region. The in vivo inducibility of a
chimera is marked with + or -. Characterisation of these chimeras resulted in a
defined region of interaction spanning from the beginning of α8 up to the first amino
acids of α10.

Figure 9: Structure of TetR. One monomer is shown in black, the other in dark
grey. Mg²⁺-tetracyclines are shown as white stick models. A class B to D exchange
in the portion depicted in light grey leads to a loss of inducibility and represents the
region involved in induction by the peptide pepBs1.

Figure 10: Expression of the peptide correlates with induction of TetR. For this
experiment, cells were grown under uninduced and induced conditions using IPTG
concentrations between 1 µM and 500 µM. The viability was determined by OD600
measurement and crude lysates were prepared to detect the expression level of the
fusion protein. LacZ measurements were carried out to determine the induction of
TetR.

A. Expression of the peptide fusion. Lane 2 of the Western blot shows uninduced
TrxA, lane 3 TrxA expressed after induction with 500 µM IPTG. Lanes 4 to 12
show the expression of the fusion protein TrxA-pepBs1 using increasing IPTG
concentrations (see table).

B. Induction of TetR (LacZ data compared to OD600 and [IPTG]). The
β-galactosidase activity is depicted as a black curve. TetR induction is
first observed when 15 µM IPTG is used to induce the expression of TrxA-pepBs1
(marked with a black arrow). Maximum induction was reached at an IPTG
concentration of 60 µM. Up to that point, the expression level of TrxA-pepBs1
correlates with the induction of TetR. Higher concentrations of the fusion protein
appear to have a negative effect on cell viability, as can be seen by the decrease of
the OD600 (see grey curve). Concomitantly, the steady state level of the fusion
protein is diminished and the induction of TetR is also reduced.

Figure 11: In vivo characterisation of non-inducible TetR mutants. The
observation that all TetR(BD) chimeras shown in figure 9 are inducible by tc, but
only a few are inducible with the peptide, provided the first indication that induction
of TetR by the peptide is mechanistically different from induction by tc. To
substantiate this assumption, we analyzed in vivo TetR mutants (Muller et al., 1995)
which carry a single amino acid exchange in tc-contacting residues. The repression
of the reporter gene by TetR is shown with light grey bars, the induction with tc with
dark grey bars. The mutants H64Y, N82A and F86A were not or only slightly
inducible with tetracycline but either fully or at least partially inducible with the
peptide (see black bars).

Figure 12: Position of the amino acids H64, N82 and F86 relative to 
tetracycline and the interaction epitope. The region of TetR-peptide interaction 
mapped for each monomer is shown in light and dark grey, respectively. Tc is 
depicted as a light grey stick model. The residues whose exchange yields mutants
that are inducible by the peptide but not by tc are grouped around the A-ring
of tc.

Figure 13: Amino acids contacting tc either directly or indirectly via the hydrated
magnesium cation (tc is shown as a light grey stick model).

Figure 14: In vivo characterisation of TetR inducibility by TrxA fusion proteins.

A. β-Galactosidase measurement. This assay is carried out in E. coli DH5α(λtet50),
which carries the lacZ gene under Tet control as a single copy integrated into the
E. coli genome. The first bar shows the repressed state when TetR binds to its
operator sequence tetO. After addition of tetracycline, TetR is induced and
β-galactosidase is expressed. TrxA fusion proteins are under transcription control of
Ptac and induced by addition of IPTG. TrxA without the fused peptide was used as a
control and is not capable of inducing TetR. After addition of 60 µM IPTG, the
C-terminal TrxA-peptide fusion induces TetR about 14-fold. The N-terminal fusion
reaches a factor of induction of about 40 and is, thus, more effective.

B. Western blot analysis. A monoclonal TrxA antibody was used to detect TrxA and
TrxA fusion proteins in samples which were used simultaneously for carrying out
β-galactosidase activity measurements and for preparing crude lysates. The data
indicate that all proteins were expressed at the same level and that the increase of
induction seen for the N-terminal fusion is not due to a higher protein amount.

Figure 15: Correlation between the protein level of PepBs1-TrxA and induction
of TetR(B).

β-Galactosidase measurement and Western blot analysis were carried out as
explained in figure 14. The N-terminal TrxA-peptide fusion was induced using IPTG
concentrations ranging from 1 µM up to 125 µM. The Western blot data show that
the amount of fusion protein correlates with the induction of TetR(B) as reflected in
the increase in β-galactosidase activity. A maximum of induction was reached using
15 µM IPTG.

Figure 16: Comparison of constitutive expression systems leading to low and
medium steady-state levels of TetR and their influence on the induction profile
by PepBs1-TrxA.

β-Galactosidase measurements were carried out as detailed in figures 14 and 15
with the exception of TetR(B) being expressed from different plasmids. pWH510lacIq
expresses TetR constitutively at a low level (shown in dark grey bars) whereas
pWH1411lacIq expresses TetR constitutively at a medium level (shown in light grey
bars). In the low expressing system, induction of TetR is observed even for very low
IPTG concentrations, which are equivalent to a low protein level of the TrxA-peptide
fusion (see figure 15). The maximum β-galactosidase activity is reached at IPTG
concentrations of 15 µM or higher. The higher amount of TetR in the medium
expression system leads to a delayed increase in β-galactosidase activity, requiring
the presence of higher amounts of the fusion protein for full induction.
Consequently, the maximum of β-galactosidase activity is reached at 60 µM IPTG
and higher.

Figure 17: Comparison of TetR(B) induction by C- and N-terminal TrxA-peptide
fusions.

β-Galactosidase measurements were carried out as explained in figure 14 and using
the low-level TetR-expression system pWH510lacIq. While induction of TetR by the
C-terminal fusion is characterised by a curve having a weakly ascending slope,
induction of TetR by the N-terminal fusion is characterised by a curve profile with a
steeply ascending slope. In addition, the N-terminal fusion leads to an about 3-fold
higher induction compared to the C-terminal fusion.

The examples illustrate the invention.

Example 1: In vitro selection of TetR-binding peptides

Several different experimental approaches can be employed to identify novel
ligands for proteins. The screening of low molecular weight compound libraries
permits the identification of small molecules that bind to a given target protein.
These libraries can consist of a diverse collection of natural products or,
alternatively, contain a large set of synthetic compounds generated by combinatorial
chemistry. A completely different approach is the use of in vitro evolution methods
(Lin and Cornish, 2002). These allow the selection of biological macromolecules that
display affinity to a defined target molecule. The SELEX method (Systematic
Evolution of Ligands by Exponential Enrichment) allows the isolation of nucleic
acids, RNA or DNA, that bind a defined target with high affinity and specificity (Tuerk
and Gold, 1990). Several methods have been developed to select peptide ligands.
Two more recent developments are "mRNA display" and "ribosome display" (Wilson
et al., 2001). A more common method is "phage display" (Giannattasio and
Weisblum, 2000), which was also used in our selection for TetR-binding peptides.

Phage display: The basis for most phage display applications is the filamentous
phage M13 from Escherichia coli. A surface protein, gpIII, which is present in five
copies per phage and important for infection of the bacteria, can tolerate N-terminal
insertions to a certain degree. Since the N-terminus of this protein is
solvent-exposed, the insertions are presented on the surface of the phage (Kay et al.,
2001). The commercially available phage library Ph.D.-12™ from New England
Biolabs was used to select for TetR-binding peptides. It contains ~10^9 different
dodecapeptides fused via a flexible linker of four amino acids to the N-terminus of
the gpIII protein. The in vitro selection was performed by coating polystyrene tubes
(NUNC Maxisorb) with purified Tet repressor protein from class B, TetR(B) (Ettner et
al., 1996). The tubes were then incubated with the pool of M13 phages and washed
several times, and TetR(B)-bound phages were eluted either specifically by addition
of TetR(B) or unspecifically by lowering the pH value. Three selection cycles were
performed and the enrichment of TetR-binding phages was monitored by
determining the M13 titer after each round (Figure 1). Individual M13 clones were
picked and sequenced after the third selection round (Figure 2).

Immunological detection of the M13-TetR interaction: Binding of individual M13
clones to TetR was determined by ELISA (Enzyme Linked Immunosorbent Assay).
The phages were amplified, precipitated and resuspended in a small volume.
96-well microtiter plates were coated with TetR(B) and blocked with bovine serum
albumin, followed by incubation with increasing amounts of M13 phages from
different isolates, several washes with buffer and, finally, incubation with an
M13-specific monoclonal antibody covalently coupled to horseradish peroxidase. Addition
of ABTS (2,2'-azino-bis(3-ethylbenzthiazoline-6-sulfonic acid)) as substrate permits the
spectrophotometric detection of phage binding to TetR. The degree of absorption
serves as a quantitative indicator of phages bound to the target protein (Kay et al.,
2001). For the peptide pepBs1, which will be discussed in more detail in the following
sections, the A_TetR/A_BSA factor determined with the phage dilution containing 10^10 pfu
was 34 (Figure 3).
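
The A_TetR/A_BSA factor is a signal-to-background ratio: the absorbance measured on the TetR-coated well divided by the absorbance on the BSA control well. The following is a minimal sketch of that calculation. The function name binding_factor and the two absorbance readings are hypothetical, since the text reports only the resulting factor and not the raw values.

```python
def binding_factor(a_target: float, a_control: float) -> float:
    """Return the ratio of the ELISA absorbance on the target-coated well
    (here TetR) to the absorbance on the negative-control well (here BSA)."""
    if a_control <= 0:
        raise ValueError("control absorbance must be positive")
    return a_target / a_control


# Hypothetical readings for one phage dilution; they are chosen only to
# illustrate a factor of the magnitude reported for pepBs1.
a_tetr = 1.70
a_bsa = 0.05
print(f"A_TetR/A_BSA = {binding_factor(a_tetr, a_bsa):.0f}")
```

A factor well above 1 indicates that the phage clone binds the target rather than the blocked plastic surface.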


Example 2: In vivo screening for TetR-inducing peptides

Establishing an in vivo screen in Escherichia coli: After having obtained and
identified several TetR-binding peptides, the next goal was to isolate peptides that
could induce TetR(B). Since small oligopeptides are rapidly degraded intracellularly
by proteases, the peptide-encoding sequences were cloned as C-terminal fusions to
the Escherichia coli protein thioredoxin, an established carrier protein for peptides
(Park and Raines, 2000). The thioredoxin fusion proteins were expressed from a tac
promoter under control of Lac repressor and, thus, inducible by addition of IPTG
(isopropyl-β-D-thiogalactoside) (Ettner et al., 1996) (see Figure 4 for illustration).
TetR(B) was expressed constitutively at a low level (Altschmied et al., 1988). The
indicator strain was E. coli DH5α(λtet50) containing the phage λtet50 (Smith and
Bertrand, 1988) integrated in single copy into the E. coli genome. This phage
contains a tetA-lacZ transcriptional fusion. Expression of β-galactosidase is, thus,
regulated by TetR binding to the tetO sequences located within the promoter (Figure
5). The pool of TetR-binding peptides is screened by plating transformed colonies
on MacConkey agar containing IPTG. An inducing Trx-peptide fusion protein will
lead to the expression of β-galactosidase, resulting in an acidification of the medium
surrounding the colony which can be detected by its yellow color (Figure 6, sectors
1, 2 and 4).

Cloning of individual candidates: Individual candidates identified by sequencing
were cloned as C-terminal fusions to TrxA, transformed into the reporter strain
DH5α(λtet50) and the respective β-galactosidase activities determined. The
following controls were also included in the measurements: To define the regulatory
window, both the repressed state (0% - TetR binds to tetO) and the fully induced
state in the presence of tc (100% - TetR dissociates from tetO) were determined. To
exclude that thioredoxin is involved in the TetR-peptide interaction, a plasmid
expressing thioredoxin without a peptide fusion was also assayed in the presence
and absence of IPTG. The β-galactosidase activities of the individual candidates
cloned were also determined in the presence and absence of IPTG.
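
The regulatory window defined by the two controls allows any measured activity to be placed on a 0% to 100% scale. The sketch below shows one way to do this normalisation. The function name percent_of_window and all activity values are hypothetical and are not taken from the measurements described here.

```python
def percent_of_window(activity: float, repressed: float, tc_induced: float) -> float:
    """Express a β-galactosidase activity relative to the regulatory window.

    0 %   corresponds to the repressed state (TetR bound to tetO),
    100 % corresponds to the fully tc-induced state.
    """
    window = tc_induced - repressed
    if window <= 0:
        raise ValueError("tc-induced activity must exceed the repressed activity")
    return 100.0 * (activity - repressed) / window


# Hypothetical activities in Miller units, for illustration only.
repressed = 100.0
tc_induced = 5000.0
candidate_with_iptg = 1325.0
print(f"{percent_of_window(candidate_with_iptg, repressed, tc_induced):.1f} %")
```

The thioredoxin-only control would be passed through the same function; a value close to 0% in the presence of IPTG indicates that the carrier alone does not induce TetR.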

β-Galactosidase activity assays: A TetR-inducing thioredoxin-peptide fusion protein
was identified in the pool of cloned TetR-binding peptides. The fusion protein is
specific for TetR(B), since a chimeric repressor, TetR(B/D), is not induced in the
presence of TrxA-pepBs1. This chimera consists of the DNA-binding domain
(residues 1-50) from TetR(B) and the protein core with the inducer-binding and
dimerization domains from TetR(D). The sequence identity between the two classes
is 63 % at the amino acid level. A single-chain TetR(B) variant, in which both
subunits of TetR are fused into a single polypeptide by a flexible protein linker
connecting the C-terminus of one subunit to the N-terminus of the other, was also
induced by TrxA-pepBs1 (Figure 7). This makes it unlikely that the peptide interacts
with the dimerization surface of TetR and induces TetR by dissociating the dimer.
The sequence of the peptide pepBs1 with linker elements is shown
in Figure 4.

Example 3: Identification of the interaction site between TetR and the inducing 
peptide 

We localized the interaction site of TrxA-pepBs1 on TetR(B) by in vivo epitope
mapping, taking advantage of the observation that TetR(B/D) is not induced by the
peptide. We therefore constructed chimeric repressors in which tetR(B) sequences
are exchanged to different extents by the corresponding sequences from tetR(D)
(Schnappinger et al., 1998) and determined their in vivo inducibility by TrxA-pepBs1.
Figures 8 and 9 summarize the results obtained. The inducibility profile shows that
interactions between repressor and peptide are confined to the region from helix α8
to residue 182 in helix α10. The loop connecting the helices α9 and α10 also
appears to be important, as chimeras containing tetR(D) sequences at residues
179-184 and 180-184 are not inducible by TrxA-pepBs1.
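
The reasoning behind the epitope mapping can be sketched in a few lines: residues replaced in a chimera that is still inducible are tolerated, while residues replaced only in non-inducible chimeras remain candidates for the interaction site. This is a simplified illustration that treats residues as independent. The function name candidate_residues and the chimera boundaries are hypothetical and do not reproduce the 17 chimeras of Figure 8.

```python
from typing import List, Tuple

# (first replaced residue, last replaced residue, inducible by the peptide)
Chimera = Tuple[int, int, bool]


def candidate_residues(chimeras: List[Chimera]) -> List[int]:
    """Return residues whose B-to-D exchange is never tolerated.

    A residue is a candidate if it is replaced in at least one
    non-inducible chimera and in no inducible chimera.
    """
    tolerated = set()
    suspect = set()
    for first, last, inducible in chimeras:
        region = set(range(first, last + 1))
        if inducible:
            tolerated |= region
        else:
            suspect |= region
    return sorted(suspect - tolerated)


# Hypothetical chimeras, for illustration only.
chimeras = [
    (50, 100, True),    # exchange tolerated
    (90, 140, False),   # induction lost
    (130, 160, False),  # induction lost
    (150, 208, True),   # exchange tolerated
]
region = candidate_residues(chimeras)
print(f"candidate region: residues {region[0]} to {region[-1]}")
```

With overlapping chimeras, the tolerated exchanges narrow the candidate region from both sides, which is how a set of chimeras can delimit an interaction site.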

Example 4: Analysis of TetR mutants with an induction-deficient phenotype

To demonstrate that induction mediated by pepBs1 follows a novel mechanism
different from that of tetracycline, we cloned induction mutants of TetR(B) (Muller et
al., 1995) into the in vivo test system. The mutants contain single residue
exchanges that lead to an induction-deficient phenotype. The mutant TetR(B) H64Y,
with an exchange of the histidine residue at position 64 by tyrosine, is not inducible
by tetracycline. β-Galactosidase activity assays show, however, that it is still
inducible by TrxA-pepBs1 (see Figures 11 to 13 and Table 1).

Example 5: The inducing tag is also active as N-terminal fusion.

The versatility of the inducing tag as a marker for protein presence would be greatly
enhanced if it were not confined to C-terminal protein fusions, as these need not be
active with all proteins (Huh et al., 2003). We therefore fused the inducing peptide
with the M13 linker element genetically to the N-terminus of thioredoxin A from E.
coli. We expressed this fusion protein from pWH610 in the same in vivo assay
system as shown in Figure 5. Fusion protein expression was induced by addition of
IPTG to a final concentration of 60 µM, β-galactosidase activities were measured
and the fusion protein amounts were determined by Western blotting of crude
extracts. The results shown in Figures 14 and 17 clearly demonstrate that the
N-terminal fusion is not only active, but more active than the C-terminal fusion. Its
40-fold induction is higher than the 14-fold induction obtained with the C-terminal
fusion. This is not due to higher steady-state amounts of the fusion protein, as is
evident from the Western blot data, but rather reflects an intrinsic activity of the
protein.

Example 6: The expression level of the tagged protein correlates with reporter
gene activity

Quantification of the tagged protein requires a correlation between the measured
reporter gene activity and the amount of the tagged protein expressed. To test if this
is the case, we induced expression of the TrxA-peptide fusion protein to different
extents by varying the amounts of the inducer IPTG. We then measured the
respective β-galactosidase activity and determined the corresponding amounts of
the TrxA-peptide fusion protein by Western blotting with a monoclonal antibody
against TrxA. Figures 10 and 15 show that higher amounts of the TrxA-peptide
fusion protein lead to higher β-galactosidase activity. In Figure 15, a
plateau is reached at an IPTG concentration of 15 µM, and β-galactosidase activity
is fully induced. This is most likely due to all TetR present in the cell having been
bound by the tagged protein. The resolution window can be shifted by increasing the
intracellular steady-state amount of TetR. For example, pWH1411 expresses higher
amounts of TetR than pWH510 (Wissmann et al., 1991). Consequently, in the
presence of pWH1411, higher amounts of the fusion protein are needed to reach
the same β-galactosidase activities as in the presence of pWH510 (compare 2.5
µM IPTG for pWH510 with 15 µM IPTG for pWH1411), and the plateau
corresponding to full induction of β-galactosidase activity is only reached at an IPTG
concentration of 60 µM (Figure 16).
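
The plateau described here can be located programmatically in a dose-response series. The sketch below returns the lowest IPTG concentration at which the reporter activity reaches a chosen fraction of its maximum. The function name plateau_onset, the 95% cut-off and both data series are hypothetical; the series are merely shaped to resemble the low and medium TetR expression systems.

```python
from typing import Sequence


def plateau_onset(concentrations: Sequence[float],
                  activities: Sequence[float],
                  fraction: float = 0.95) -> float:
    """Return the lowest inducer concentration whose reporter activity
    reaches at least `fraction` of the maximum activity in the series."""
    if len(concentrations) != len(activities) or not activities:
        raise ValueError("need two non-empty series of equal length")
    threshold = fraction * max(activities)
    for conc, act in sorted(zip(concentrations, activities)):
        if act >= threshold:
            return conc
    raise ValueError("no data point reaches the threshold")


# Hypothetical dose-response data (IPTG in µM, activity in Miller units).
iptg = [1, 2.5, 5, 15, 30, 60, 125]
low_tetr = [150, 400, 900, 1450, 1480, 1500, 1490]
medium_tetr = [150, 160, 300, 800, 1200, 1480, 1500]

print("low TetR level:   ", plateau_onset(iptg, low_tetr), "µM IPTG")
print("medium TetR level:", plateau_onset(iptg, medium_tetr), "µM IPTG")
```

A shift of the plateau onset to higher inducer concentrations when more TetR is present corresponds to the shifted resolution window discussed in the text.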


Table 1: Repression and inducibility of TetR mutants by tc or TrxA-pepBs1.

This table summarises the TetR mutants characterised. They all carry a single
amino acid exchange in residues surrounding the inducer tc and are
induction-deficient as TetR(B) or as TetR(D) variants in a different and less sensitive
in vivo assay system (Muller et al., 1995, Schubert, 2001). The positions of the
individual residues relative to tc are shown schematically in figure 13 with the
exception of D53 and E114. Column 2 shows the repressed state in the absence of
any inducer, column 3 the tc-induced state. Columns 4 and 5 show the in vivo data in
the presence of the TrxA-pepBs1 construct, whereby column 5 represents the
peptide-induced state after induction with 60 µM IPTG. Mutants with an at least
three-fold increase in β-galactosidase activity compared to the repressed state are
defined as inducible and are shown in bold. The mutants H64Y, N82A and F86A
(shown in detail in Figures 12 and 13) are not or only slightly inducible by tc, but
induced with the peptide (Figure 11). The mutants P105A and E114G are inducible by
tc, but not by the peptide. The same holds true for many of the TetR(B/D) chimeras.
The fact that we find no correlation between tc-inducible and peptide-inducible
mutants supports the assumption that the mechanisms underlying peptide induction
and tc induction are different.




TetR(B)    β-Galactosidase activity [MU]
variant    (2) repressed   (3) + tc      (4) + TrxA-pepBs1   (5) + TrxA-pepBs1,
                                                             60 µM IPTG

D53G       157 ± 6.2       1740 ± 319    144 ± 4.4           1322 ± 86
H64Y       135 ± 2.9       47 ± 21       140 ± 6.2           1212 ± 83
N82A       165 ± 3.6       125 ± 2.4     178 ± 8.0           618 ± 8.9
F86A       162 ± 2.8       480 ± 33      151 ± 11            1464 ± 65
H100Y      125 ± 11        56 ± 8.3      129 ± 9.8           136 ± 11
T103A      142 ± 3.6       802 ± 174     154 ± 9.9           712 ± 45
P105A      173 ± 3.4       4085 ± 35     185 ± 9.4           210 ± 14
E114G      117 ± 9.8       1574 ± 196    123 ± 12            178 ± 21
E147A      154 ± 5.8       526 ± 185     173 ± 12            629 ± 77
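
The three-fold criterion stated in the legend of Table 1 can be applied mechanically to the mean activities. The sketch below does so under two assumptions that are not stated in the text: the flattened table values pair up per variant as repressed and tc-induced activity, followed by the two TrxA-pepBs1 columns, and the repressed state without inducer is used as the reference for both inducers. The values are transcribed as printed, standard deviations are omitted, and the names TABLE_1 and is_inducible are hypothetical.

```python
# Mean β-galactosidase activities [MU] per variant:
# (repressed, + tc, + TrxA-pepBs1 at 60 µM IPTG).
# The column assignment is an assumption about the flattened table.
TABLE_1 = {
    "D53G": (157, 1740, 1322),
    "H64Y": (135, 47, 1212),
    "N82A": (165, 125, 618),
    "F86A": (162, 480, 1464),
    "H100Y": (125, 56, 136),
    "T103A": (142, 802, 712),
    "P105A": (173, 4085, 210),
    "E114G": (117, 1574, 178),
    "E147A": (154, 526, 629),
}

THRESHOLD = 3.0  # "at least three fold increase" compared to the repressed state


def is_inducible(induced: float, repressed: float, threshold: float = THRESHOLD) -> bool:
    """Apply the fold-increase criterion from the table legend."""
    return induced / repressed >= threshold


tc_inducible = {v for v, (rep, tc, pep) in TABLE_1.items() if is_inducible(tc, rep)}
peptide_inducible = {v for v, (rep, tc, pep) in TABLE_1.items() if is_inducible(pep, rep)}

for variant, (rep, tc, pep) in TABLE_1.items():
    print(f"{variant:6s} tc: {tc / rep:5.1f}-fold  peptide: {pep / rep:5.1f}-fold")
print("peptide only:", sorted(peptide_inducible - tc_inducible))
print("tc only:     ", sorted(tc_inducible - peptide_inducible))
```

Under these assumptions, H64Y, N82A and F86A fall into the peptide-only group and P105A and E114G into the tc-only group, in line with the legend.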




Claims

1. A method for monitoring the expression level of a gene in a host cell by
modulating the activity of a regulatory biomolecule, comprising the steps of:
(a) transforming a cell expressing a regulatory biomolecule with a nucleic
    acid molecule comprising an open reading frame encoding an
    interaction partner of said biomolecule in expressible form, wherein
    (i.) said regulatory biomolecule is either a nucleic acid binding
         molecule that effects its regulatory activity upon binding or an
         allosterically controlled ribonucleic acid molecule; and
    (ii.) the interaction partner of the biomolecule is encoded by a
         nucleic acid molecule comprising:
         (1) a nucleic acid sequence encoding a tagged (poly)peptide,
             or
         (2) a nucleic acid sequence encoding a tagged (poly)peptide
             or a peptide tag, a selectable marker gene and additional
             nucleotide sequences for site specific, in-frame integration
             of said nucleic acid molecule into the coding sequence of
             at least one host (poly)peptide of interest,
         wherein said tag comprises the interacting residues of the
         interaction partner
    and
(b) assessing the expression level of the gene.

2. The method of claim 1, wherein the activity or degree of modulation of the
activity of the biomolecule is measured via a readout system.

3. The method of claim 2, wherein
(a) the readout system is provided by a nucleic acid molecule encoding a
    reporter protein;
(b) the regulatory biomolecule is
    (i.) a nucleic acid binding (poly)peptide selected from the group
         consisting of regulators of transcription, regulators of
         translation, recombinases, (poly)peptides involved in RNA
         transport,
    (ii.) an RNA molecule selected from the groups consisting of
         allosteric ribozymes, riboswitches or translation-regulating
         aptamers,
    (iii.)
    and
(c) the nucleic acid binding biomolecule is allosterically regulated.

4. The method of claim 2 or 3, wherein the nucleic acid binding (poly)peptide
comprises the (poly)peptide sequence of
(a) Tetracycline repressor;
(b) Lac repressor;
(c) Xylose repressor;
(d) AraC protein;
(e) TetR-based T-Rex system;
(f) Erythromycin-specific repressor MphR(A);
(g) Pip (pristinamycin interacting protein);
(h) ScbR of Streptomyces coelicolor;
(i) TraR of Agrobacterium tumefaciens fused to the eukaryotic activation
    domain p65 of NFκB;
(j) chimeric proteins consisting of the:
    (i.) Gal4 DNA-binding domain and either a full-length PhyA protein
         (PhyA-GBD) or the N-terminal photosensory domain of PhyB
         [PhyB(NT)-GBD] of Arabidopsis thaliana;
    (ii.) a steroid hormone regulated system consisting of (1) a Gal4
         DNA-binding domain fused to a human progesterone receptor
         ligand binding domain and an NFκB-derived p65 eukaryotic
         transcription activation domain and (2) the inducer mifepristone;
    (iii.) a dimerizer system consisting of (1) a ZFHD1 DNA-binding
         domain fused to FKBP12, (2) FRAP fused to the NFκB-derived
         p65 activation domain and (3) rapamycin, AP22565 or AP12967
         as heterodimer-forming agents; or
    (iv.) an Ecdysone-Inducible Expression System containing (1) a
         modified form of the Drosophila ecdysone receptor (VgEcR)
         fused to a VP16 activation domain, (2) the mammalian
         homologue RXR of ultraspiracle, the natural binding partner of
         the ecdysone receptor in Drosophila and (3) the inducers
         ponasterone A and muristerone A.

5. The method of claim 3 or 4, wherein the reporter protein of the readout
system is β-galactosidase, CAT, β-glucuronidase, β-xylosidase, XylE
(catechol dioxygenase), TreA (trehalase), GFP and variants CFP, YFP,
EGFP, GFP+, bacterial luciferase (luxAB), Photinus luciferase, Renilla
luciferase, coral-derived photoproteins including DsRed, HcRed, AmCyan,
ZsGreen, ZsYellow, AsRed, alkaline phosphatase or secreted alkaline
phosphatase.

6. The method of claim 3 or 4, wherein the reporter protein is a protein that
confers resistance to an antibiotic.

7. The method of any one of claims 1 to 6, wherein the cell is selected from a
mammalian, insect, nematode, plant, yeast, protist cell, gram-positive or
gram-negative bacteria, archaebacteria or protozoa.

8. The method of any one of claims 1 to 7, wherein all or a subset of the
proteins encoded by the cell are tagged.

9. A method of producing and/or selecting a compound capable of modulating
the activity of a nucleic acid binding protein comprising the steps of:
(a) conducting a selection of compounds with the nucleic acid binding
    target protein under conditions allowing an interaction of the
    compound and the nucleic acid binding protein;
(b) removing unspecifically bound compounds;
(c) detecting specific binding of compounds to the nucleic acid binding
    target protein;
(d) expressing in a cell the nucleic acid binding protein and providing in
    trans the coding sequence of at least one indicator gene, wherein said
    coding sequence is under control of the target sequence of the nucleic
    acid binding protein;
(e) adding a candidate compound to the cell of step (d);
(f) determining the amount or activity of the indicator protein, wherein a
    reduced or increased amount of indicator protein is indicative of
    compounds capable of modulating the activity of the nucleic acid
    binding protein; and
(g) selecting compounds capable of modulating the activity of the nucleic
    acid binding protein.

10. A nucleic acid molecule encoding a (poly)peptide comprising the sequence
(a) Met - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe - Ala - Ala - Pro
    - Ser - Gly - Gly - Gly - Ser (SEQ ID NO: 1);
(b) Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe - Ala - Ala - Pro - Ser
    - Gly - Gly - Gly - Ser (SEQ ID NO: 2);
(c) Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe - Ala - Ala - Pro - Ser
    (SEQ ID NO: 3);
(d) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe -
    Ala - Ala - Pro - Ser - Gly - Gly - Gly - Ser (SEQ ID NO: 4);
(e) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe -
    Ala - Ala - Ala - Ser - Gly - Gly - Gly - Ser (SEQ ID NO: 5);
(f) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe -
    Ala - Ala - Pro - Ala - Gly - Gly - Gly - Ser (SEQ ID NO: 6);
(g) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe -
    Ala - Ala - Pro - Ser - Gly - Arg - Gly - Ser (SEQ ID NO: 7);
(h) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe -
    Ala - Ala - Pro - Ser - Asp - Gly - Gly - Leu (SEQ ID NO: 8);
(i) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe -
    Ala - Ala - Pro - Ser - Gly - Glu - Gly - Ser (SEQ ID NO: 9);
(j) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe -
    Ala - Ala - Pro - Ser - Gly - Gly - Gly - Trp (SEQ ID NO: 10);
(k) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe -
    Ala - Ala - Pro - Ser - Gly - Gly - Cys - Ser (SEQ ID NO: 11);
(l) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe -
    Ala - Ala - Pro - Ser - Gly - Gly - Asp - Ser (SEQ ID NO: 12);
(m) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe -
    Ala - Ala - Pro - Ser - Gly - Gly - Arg - Ser (SEQ ID NO: 13);
(n) Ser - Gly - Gly - Ala - Trp - Thr - Trp - Asn - Ala - Phe - Ala - Phe -
    Ala - Ala - Pro - Ser - Gly - Gly - Gly - Ser (SEQ ID NO: 14); or
(o) a nucleic acid molecule, the complementary strand of which hybridizes
    under stringent conditions to the nucleic acid molecule of any one of
    (a) to (n), wherein said nucleic acid molecule encodes an interaction
    partner which is capable of modulating the activity of a nucleic acid
    binding protein.

11. A (poly)peptide encoded by the nucleic acid sequence of claim 10.

12. An expression vector, comprising an expression control sequence, a multiple
cloning site for inserting a gene of interest and the nucleic acid molecule of
claim 10, wherein the gene of interest is inserted in-frame with the ORF
encoding the peptide.

13. A host cell containing the nucleic acid molecule or the expression vector of
claim 12.

14. The host cell of claim 13, wherein the nucleic acid molecule is fused in frame
to at least one chromosomal sequence encoding a (poly)peptide.

15. The host cell of claim 13 or 14, also containing
(a) the coding sequence of a reporter protein which is under control of a
    nucleic acid binding protein; and
(b) the coding sequence of the nucleic acid binding protein of (a),
wherein said coding sequences are operatively linked to expression control
sequences.

16. The host cell of any one of claims 13 to 15, wherein the nucleic acid binding
protein is a repressor of transcription.

17. The host cell of claim 16, wherein the nucleic acid binding protein is a
(poly)peptide comprising the (poly)peptide sequence of
(a) Tetracycline repressor;
(b) Lac repressor;
(c) Xylose repressor;
(d) AraC protein;
(e) TetR-based T-Rex system;
(f) Erythromycin-specific repressor MphR(A);
(g) Pip (pristinamycin interacting protein);
(h) ScbR of Streptomyces coelicolor;
(i) TraR of Agrobacterium tumefaciens fused to the eukaryotic activation
domain p65 of NFκB; or
(j) chimeric proteins consisting of the:
(i.) Gal4 DNA-binding domain and either (1) a full-length PhyA
protein (PhyA-GBD) or (2) the N-terminal photosensory domain
of PhyB [PhyB(NT)-GBD] of Arabidopsis thaliana;
(ii.) a steroid hormone regulated system consisting of (1) a Gal4
DNA-binding domain fused to a human progesterone receptor
ligand binding domain and an NFκB-derived p65 eukaryotic
transcription activation domain and (2) the inducer mifepristone;
(iii.) a dimerizer system consisting of (1) a ZFHD1 DNA-binding
domain fused to FKBP12, (2) FRAP fused to the NFκB-derived
p65 activation domain and (3) rapamycin, AP22565 or AP12967
as heterodimer-forming agents;
(iv.) an Ecdysone-Inducible Expression System containing (1) a
modified form of the Drosophila ecdysone receptor (VgEcR)
fused to a VP16 activation domain, (2) the mammalian
homologue RXR of ultraspiracle, the natural binding partner of
the ecdysone receptor in Drosophila and (3) the inducers
ponasterone A and muristerone A.

18. The host cell of any one of claims 15 to 17, wherein the nucleic acid binding
protein is an enhancer or activator of transcription.

19. An ensemble of host cells of any one of claims 13 to 18, wherein said
ensemble comprises two or more cells, each of which contains at least one
nucleic acid molecule fused in frame to an open reading frame encoding a
(poly)peptide.

20. The ensemble of host cells of claim 19, wherein said ensemble contains
subpopulations with different open reading frames being fused to said nucleic
acid molecule.

21. The ensemble of host cells of claim 20, wherein the sum of said open reading
frames forms the proteome of the host cell.

22. A non-human animal comprising the host cell of any one of claims 13 to 18 or
the ensemble of any one of claims 19 to 21.

23. A kit comprising
(a) the nucleic acid molecule of claim 10;
(b) the (poly)peptide of claim 11;
(c) the vector of claim 12;
(d) the host cell or ensemble of host cells of any one of claims 13 to 22;
and/or
(e) instructions for use;
in one or more containers.

24. Use of the (poly)peptide of claim 11, the nucleic acid molecule of claim 10,
the expression vector of claim 12 or the host cell or ensemble of host cells of
any one of claims 13 to 22 for monitoring expression of a gene.
Abstract

The present invention relates to a method for monitoring the expression level of a gene
in a host cell by modulating the activity of a regulatory biomolecule, comprising the
steps of: (a) transforming a cell expressing a regulatory biomolecule with a nucleic
acid molecule comprising an open reading frame encoding an interaction partner of
said biomolecule in expressible form, wherein (i) said regulatory biomolecule is
either a nucleic acid binding molecule that effects its regulatory activity upon binding
or an allosterically controlled ribonucleic acid molecule; and (ii) the interaction
partner of the biomolecule is encoded by a nucleic acid molecule comprising: (1) a
nucleic acid sequence encoding a tagged (poly)peptide, or (2) a nucleic acid
sequence encoding a tagged (poly)peptide or a peptide tag, a selectable marker
gene and additional nucleotide sequences for site specific, in-frame integration of
said nucleic acid molecule into the coding sequence of at least one host
(poly)peptide of interest, wherein said tag comprises the interacting residues of the
interaction partner; and (b) assessing the expression level of the gene. Furthermore,
the present invention relates to a method of producing and/or selecting a compound
capable of modulating the activity of a nucleic acid binding protein comprising the
steps of: (a) conducting a selection of compounds with the nucleic acid binding
target protein under conditions allowing an interaction of the compound and the
nucleic acid binding protein; (b) removing unspecifically bound compounds; (c)
detecting specific binding of compounds to the nucleic acid binding target protein;
(d) expressing in a cell the nucleic acid binding protein and providing in trans the
coding sequence of at least one indicator gene, wherein said coding sequence is
under control of the target sequence of the nucleic acid binding protein; (e) adding a
candidate compound to the cell of step (d); (f) determining the amount or activity of
the indicator protein, wherein a reduced or increased amount of indicator protein is
indicative of compounds capable of modulating the activity of the nucleic acid
binding protein; (g) selecting compounds capable of modulating the activity of the
nucleic acid binding protein. Moreover, the present invention relates to nucleic acid
molecules, polypeptides, expression vectors, host cells, ensembles of host cells and
a non-human animal comprising said host cells. Finally, the present invention
relates to a kit comprising the nucleic acid molecule, the (poly)peptide, the vector,
the host cell or the ensemble of host cells of the present invention in one or more
containers.
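
Steps (f) and (g) of the selection method can be illustrated by a minimal sketch. It assumes that indicator activity has been measured for each candidate compound and for an untreated control, and that a fold-change cut-off is used to call a compound "reduced" or "increased". The names `Reading`, `select_modulators` and the default cut-off of 2.0 are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One indicator measurement for one candidate compound (hypothetical type)."""
    compound: str
    indicator_activity: float  # arbitrary units of the indicator protein


def select_modulators(readings, control_activity, fold_threshold=2.0):
    """Return {compound: "increased" | "reduced"} for compounds whose indicator
    activity differs from the control by at least fold_threshold.

    The fold-change criterion is an illustrative assumption; the text only
    states that a reduced or increased indicator amount is indicative.
    """
    if control_activity <= 0:
        raise ValueError("control_activity must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")

    selected = {}
    for reading in readings:
        ratio = reading.indicator_activity / control_activity
        if ratio >= fold_threshold:
            selected[reading.compound] = "increased"
        elif ratio <= 1 / fold_threshold:
            selected[reading.compound] = "reduced"
    return selected


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    demo = [Reading("A", 500.0), Reading("B", 100.0), Reading("C", 20.0)]
    print(select_modulators(demo, control_activity=100.0))
```

Compounds whose indicator activity stays within the cut-off on either side of the control are simply not returned, which corresponds to not selecting them in step (g).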
Figure 1: Experimental procedure for the in vitro selection.

Figure 2: Example for in vitro selected sequences.

Unspecific elution (Gly buffer, pH 2.2)

pep1  Trp - His - Gly - Ala - Ile - Leu - Gly - Ser - Ala - Arg - Ala - Gln
pep2  Leu - Pro - Ser - Tyr - Met - Leu - His - Leu - Trp - Ser - Pro - His
pep3  Ala - His - Tyr - Ser - Leu - Tyr - Trp - Pro - Met - Ala - Thr - Phe
pep4  Tyr - His - Asn - Leu - Tyr - Gly - Leu - Pro - Leu - Gly - Pro - Trp
pep5  Trp - His - Gln - Thr - Tyr - Thr - Ser - Ser - Leu - Trp - Glu - Ser

Specific elution (TetR, 4pM)

pep1  Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe - Ala - Ala - Pro - Ser
pep2  Trp - His - Ser - Ser - Phe - Asn - Met - Phe - Ala - Tyr - Pro - Met
pep3  Trp - His - Leu - Pro - Leu - Ser - Trp - Thr - Thr - Arg - Leu - Pro
pep4  Trp - His - Thr - Pro - Ile - Ser - Leu - Leu - Lys - Gln - Val - Arg
pep5  Trp - His - Trp - Thr - Phe - Ser - Ser - Pro - Leu - Met - Gln - Thr
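
The three-letter notation of Figure 2 can be converted to one-letter codes to make the clones easier to compare. The sketch below assumes that the two six-residue blocks listed for each clone are the two halves of a single 12-residue insert; under that assumption, pep1 of the specific elution is identical to SEQ ID NO: 3. The helper name `to_one_letter` is hypothetical.

```python
# Standard three-letter to one-letter codes for the 20 proteinogenic amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(sequence: str) -> str:
    """Convert 'Trp - Thr - Trp' or 'Trp Thr Trp' to 'WTW'.

    Raises KeyError for a token that is not a known three-letter code,
    which helps to catch OCR damage such as 'Gin' or 'lie'.
    """
    tokens = sequence.replace("-", " ").split()
    return "".join(THREE_TO_ONE[token] for token in tokens)


# pep1 of the specific elution (both halves joined) and SEQ ID NO: 3.
PEP1_SPECIFIC = "Trp - Thr - Trp - Asn - Ala - Tyr - Ala - Phe - Ala - Ala - Pro - Ser"
SEQ_ID_NO_3 = "Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser"

if __name__ == "__main__":
    print(to_one_letter(PEP1_SPECIFIC))
    print(to_one_letter(PEP1_SPECIFIC) == to_one_letter(SEQ_ID_NO_3))
```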
Figure 3: Characterisation of TetR-phage binding by ELISA.

Figure 4: Design of the peptide expressing construct.

Figure 5: Setup of the in vivo screening system.

Figure 6: MacConkey plate.

Figure 7: LacZ assay for the TetR-inducing fusion protein TrxA-pepBsl.

Figure 8: Identification of the region of interaction between TetR and TrxA-pepBsl by
in vivo epitope mapping.

Figure 9: Structure of TetR.

Figure 10: Expression of the peptide correlates with induction of TetR.

Figure 11: In vivo characterisation of non-inducible TetR mutants.

Figure 12: Position of the amino acids H64, N82 and F86 relative to tetracycline and the
interaction epitope.

Figure 13: Amino acids contacting tc.

Figure 14: In vivo characterisation of TetR inducibility by TrxA fusion proteins.

Figure 15: Correlation between the protein level and induction of TetR(B).

Figure 16: Comparison of a low and high TetR-expressing system.



[Plot: vertical axis scaled from 0.0E+00 to 7.0E+03; legend entries: pWH1411-system, pWH510-system]

Figure 17: Comparison of TetR(B) induction by C- and N-terminal TrxA-peptide
fusions.
EPO - Munich

SEQUENCE LISTING
<110> Friedrich-Alexander-Universität Erlangen-Nürnberg

<120> PEPTIDE-BASED METHOD FOR MONITORING GENE EXPRESSION IN A HOST CELL

<130> H 1776 EP

<160> 14

<170> PatentIn version 3.1

<210> 1
<211> 17
<212> PRT
<213> artificial sequence

<220>
<223> /note="Description of artificial sequence: polypeptides with affinity to
the Tet repressor"

<400> 1

Met Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser Gly Gly Gly
1               5                   10                  15

Ser



<210> 2
<211> 16
<212> PRT
<213> artificial sequence

<220>
<223> /note="Description of artificial sequence: polypeptides with affinity to
the Tet repressor"

<400> 2

Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser Gly Gly Gly Ser
1               5                   10                  15

<210> 3
<211> 12
<212> PRT
<213> artificial sequence

<220>
<223> /note="Description of artificial sequence: polypeptides with affinity to
the Tet repressor"

<400> 3

Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser
1               5                   10

<210> 4
<211> 20
<212> PRT
<213> artificial sequence

<220>
<223> /note="Description of artificial sequence: polypeptides with affinity to
the Tet repressor"

<400> 4

Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser
1               5                   10                  15

Gly Gly Gly Ser
            20

<210> 5
<211> 20
<212> PRT
<213> artificial sequence

<220>
<223> /note="Description of artificial sequence: polypeptides with affinity to
the Tet repressor"

<400> 5

Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Ala Ser
1               5                   10                  15

Gly Gly Gly Ser
            20

<210> 6
<211> 20
<212> PRT
<213> artificial sequence

<220>
<223> /note="Description of artificial sequence: polypeptides with affinity to
the Tet repressor"

<400> 6

Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ala
1               5                   10                  15

Gly Gly Gly Ser
            20

<210> 7
<211> 20
<212> PRT
<213> artificial sequence

<220>
<223> /note="Description of artificial sequence: polypeptides with affinity to
the Tet repressor"

<400> 7

Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser
1               5                   10                  15

Gly Arg Gly Ser
            20

<210> 8
<211> 20
<212> PRT
<213> artificial sequence

<220>
<223> /note="Description of artificial sequence: polypeptides with affinity to
the Tet repressor"

<400> 8

Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser
1               5                   10                  15

Asp Gly Gly Leu
            20



<210> 9
<211> 20
<212> PRT
<213> artificial sequence

<220>
<223> /note="Description of artificial sequence: polypeptides with affinity to
the Tet repressor"

<400> 9

Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser
1               5                   10                  15

Asp Gly Gly Leu
            20



<210> 10
<211> 20
<212> PRT
<213> artificial sequence

<220>
<223> /note="Description of artificial sequence: polypeptides with affinity to
the Tet repressor"

<400> 10

Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser
1               5                   10                  15

Gly Gly Gly Trp
            20



<210> 11
<211> 20
<212> PRT
<213> artificial sequence

<220>
<223> /note="Description of artificial sequence: polypeptides with affinity to
the Tet repressor"

<400> 11

Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser
1               5                   10                  15

Gly Gly Cys Ser
            20



<210> 12
<211> 20
<212> PRT
<213> artificial sequence

<220>
<223> /note="Description of artificial sequence: polypeptides with affinity to
the Tet repressor"

<400> 12

Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser
1               5                   10                  15

Gly Gly Asp Ser
            20



<210> 13
<211> 20
<212> PRT
<213> artificial sequence

<220>
<223> /note="Description of artificial sequence: polypeptides with affinity to
the Tet repressor"

<400> 13

Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser
1               5                   10                  15

Gly Gly Arg Ser
            20



<210> 14
<211> 20
<212> PRT
<213> artificial sequence

<220>
<223> /note="Description of artificial sequence: polypeptides with affinity to
the Tet repressor"

<400> 14

Ser Gly Gly Ala Trp Thr Trp Asn Ala Phe Ala Phe Ala Ala Pro Ser
1               5                   10                  15

Gly Gly Gly Ser
            20
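
Several of the 20-residue entries differ from SEQ ID NO: 4 at a single position. The sketch below lists these substitutions and checks each sequence against its declared length of 20 (the <211> field). It covers only the entries whose residues are unambiguous in this listing (SEQ ID NOs: 4, 5, 6, 7, 10, 12, 13 and 14); the dictionary layout and the function name `substitutions` are hypothetical and serve only as illustration.

```python
# Residues transcribed from the <400> records above; SEQ ID NO: 4 is the reference.
SEQUENCES = {
    4: "Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser Gly Gly Gly Ser",
    5: "Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Ala Ser Gly Gly Gly Ser",
    6: "Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ala Gly Gly Gly Ser",
    7: "Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser Gly Arg Gly Ser",
    10: "Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser Gly Gly Gly Trp",
    12: "Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser Gly Gly Asp Ser",
    13: "Ser Gly Gly Ala Trp Thr Trp Asn Ala Tyr Ala Phe Ala Ala Pro Ser Gly Gly Arg Ser",
    14: "Ser Gly Gly Ala Trp Thr Trp Asn Ala Phe Ala Phe Ala Ala Pro Ser Gly Gly Gly Ser",
}
DECLARED_LENGTH = 20  # value of <211> for each of the entries above


def substitutions(reference: str, variant: str):
    """Return [(position, reference_residue, variant_residue), ...], 1-based."""
    ref, var = reference.split(), variant.split()
    if len(ref) != len(var):
        raise ValueError("sequences must have the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(ref, var), start=1)
        if a != b
    ]


if __name__ == "__main__":
    reference = SEQUENCES[4]
    for seq_id, seq in sorted(SEQUENCES.items()):
        # Check the transcribed residues against the declared <211> length.
        assert len(seq.split()) == DECLARED_LENGTH, seq_id
        if seq_id != 4:
            print(seq_id, substitutions(reference, seq))
```

Positions are counted from the N-terminal Ser of SEQ ID NO: 4, matching the numbering printed under each sequence.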



